From: Andrew Morton <akpm@linux-foundation.org>
To: Tony Luck <tony.luck@gmail.com>
Cc: fengguang.wu@intel.com, mingo@elte.hu, rostedt@goodmis.org,
	fweisbec@gmail.com, lwoodman@redhat.com, a.p.zijlstra@chello.nl,
	penberg@cs.helsinki.fi, eduard.munteanu@linux360.ro,
	linux-kernel@vger.kernel.org, kosaki.motohiro@jp.fujitsu.com,
	andi@firstfloor.org, mpm@selenic.com, adobriyan@gmail.com,
	linux-mm@kvack.org
Subject: Re: [PATCH 5/5] proc: export more page flags in /proc/kpageflags
Date: Tue, 28 Apr 2009 14:17:38 -0700
Message-ID: <20090428141738.77e599f4.akpm@linux-foundation.org>
In-Reply-To: <12c511ca0904281111r10f37a5coe5a2750f4dbfbcda@mail.gmail.com>

On Tue, 28 Apr 2009 11:11:52 -0700
Tony Luck <tony.luck@gmail.com> wrote:

> On Tue, Apr 28, 2009 at 1:33 AM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> > 1) FAST
> >
> > It takes merely 0.2s to scan 4GB pages:
> >
> >         ./page-types  0.02s user 0.20s system 99% cpu 0.216 total
> 
> OK on a tiny system ... but sounds painful on a big
> server. 0.2s for 4G scales up to 3 minutes 25 seconds
> on a 4TB system (4TB systems were being sold two
> years ago ... so by now the high end will have moved
> up to 8TB or perhaps 16TB).
> 
> Would the resulting output be anything but noise on
> a big system (a *lot* of pages can change state in
> 3 minutes)?
> 

Reading the state of all of memory in this fashion would be a somewhat
peculiar thing to do.  Bear in mind that kpagemap and friends are also
designed to allow userspace to inspect the state of a particular
process's memory.

Documentation/vm/pagemap.txt describes it nicely:

: The general procedure for using pagemap to find out about a process' memory
: usage goes like this:
: 
:  1. Read /proc/pid/maps to determine which parts of the memory space are
:     mapped to what.
:  2. Select the maps you are interested in -- all of them, or a particular
:     library, or the stack or the heap, etc.
:  3. Open /proc/pid/pagemap and seek to the pages you would like to examine.
:  4. Read a u64 for each page from pagemap.
:  5. Open /proc/kpagecount and/or /proc/kpageflags.  For each PFN you just
:     read, seek to that entry in the file, and read the data you want.

although I expect that this is not the use case when the feature is
being used to debug/tune readahead.
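
As a rough illustration of the five steps quoted above, here is a minimal
Python sketch.  It assumes the entry layout described in pagemap.txt: one
u64 per page, bit 63 meaning "present", bit 62 meaning "swapped", and bits
0-54 holding the PFN when the page is present.  The helper names
(pagemap_offset, decode_pagemap_entry, kpage_offset, read_u64,
page_flags_for) are made up for this example and are not part of any kernel
tool.  Reading /proc/kpageflags normally needs root, and newer kernels
report a PFN of zero to unprivileged readers of pagemap, so treat this as a
sketch rather than a finished tool.

```python
import os
import struct

ENTRY_SIZE = 8                          # pagemap, kpagecount and kpageflags hold u64 entries
PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")  # page size of the running system

PM_PRESENT = 1 << 63                    # bit 63: page is present in RAM
PM_SWAPPED = 1 << 62                    # bit 62: page is in swap
PM_PFN_MASK = (1 << 55) - 1             # bits 0-54: PFN, valid only when present


def pagemap_offset(vaddr, page_size=PAGE_SIZE):
    """Step 3: byte offset in /proc/pid/pagemap of the entry covering vaddr."""
    return (vaddr // page_size) * ENTRY_SIZE


def decode_pagemap_entry(raw):
    """Step 4: split one 8-byte pagemap entry into (present, swapped, pfn)."""
    (entry,) = struct.unpack("=Q", raw)
    present = bool(entry & PM_PRESENT)
    swapped = bool(entry & PM_SWAPPED)
    pfn = (entry & PM_PFN_MASK) if present else None
    return present, swapped, pfn


def kpage_offset(pfn):
    """Step 5: byte offset of a PFN's entry in /proc/kpagecount or /proc/kpageflags."""
    return pfn * ENTRY_SIZE


def read_u64(path, offset):
    """Seek to offset in path and return the u64 stored there."""
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(ENTRY_SIZE)
    if len(raw) != ENTRY_SIZE:
        raise EOFError(f"short read from {path} at offset {offset}")
    return struct.unpack("=Q", raw)[0]


def page_flags_for(pid, vaddr):
    """Hypothetical helper: kpageflags word for one virtual address, or None
    if the page is not present in RAM."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(pagemap_offset(vaddr))
        raw = f.read(ENTRY_SIZE)
    if len(raw) != ENTRY_SIZE:
        return None
    present, _swapped, pfn = decode_pagemap_entry(raw)
    if not present:
        return None
    return read_u64("/proc/kpageflags", kpage_offset(pfn))
```

Steps 1 and 2 (picking address ranges out of /proc/pid/maps) are left out;
a caller would walk each chosen range one page at a time and call
page_flags_for() on every address.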

But yes, if you have huge amounts of memory and you decide to write an
application which inspects the state of every physical page in the
machine, you can expect it to take a long time!
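
For a feel of the numbers, the estimate quoted above can be reproduced with
a few lines of Python.  This assumes scan time grows linearly with the
amount of memory, which is only an approximation, and takes the 0.2s for
4GB measurement as the reference point.  The function name is made up for
this example.

```python
GIB = 1 << 30
TIB = 1 << 40


def estimate_scan_seconds(mem_bytes, ref_bytes=4 * GIB, ref_seconds=0.2):
    """Scale a measured full-scan time linearly to another memory size.

    Assumption: cost per page is constant, so time is proportional to size.
    """
    return ref_seconds * (mem_bytes / ref_bytes)


def as_minutes_seconds(seconds):
    """Format a duration as (whole minutes, remaining seconds)."""
    minutes, rest = divmod(seconds, 60)
    return int(minutes), rest


if __name__ == "__main__":
    for size in (4 * TIB, 8 * TIB, 16 * TIB):
        m, s = as_minutes_seconds(estimate_scan_seconds(size))
        print(f"{size // TIB:3d} TiB: {m} min {s:.1f} s")
```

For 4TB this gives about 204.8 seconds, which matches the 3 minutes 25
seconds figure in the quoted text.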

Of course, the VM does also accumulate bulk aggregated page statistics
and presents them in /proc/meminfo, /proc/vmstat and probably other
places.  These numbers are maintained at runtime and the cost of doing
this is significant.

I don't _think_ there are presently any such counters which are
accumulated simply for instrumentation purposes - the kernel needs to
maintain them anyway for various reasons and it's a simple (and useful)
matter to make them available to userspace.
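
Reading those aggregated counters from userspace is cheap by comparison.
A minimal sketch, assuming the usual /proc/vmstat layout of one
"name value" pair per line (parse_vmstat and read_vmstat are names invented
for this example, and the counter values in the usage note are made up):

```python
def parse_vmstat(text):
    """Turn the contents of /proc/vmstat into a {name: int} dict.

    Lines that do not look like "name <digits>" are skipped.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())
```

For example, parse_vmstat("nr_free_pages 12345\nnr_dirty 7\n") returns
{"nr_free_pages": 12345, "nr_dirty": 7}.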


Generally, I think that pagemap is another of those things where we've
failed on the follow-through.  There's a nice and powerful interface
for inspecting the state of a process's VM, but nobody knows about it
and there are no tools for accessing it and nobody is using it.

(Or maybe I'm wrong about that - I expect I'd have bugged Matt about
this and I expect that he'd have done something.  Brain failed).

Either way, I think we'd serve the world better if we were to have some
nice little userspace tools which users could use to access this
information.  Documentation/vm already has a Makefile!

Fengguang, you mention an executable called "page-types".  Perhaps you
could "productise" that sometime?

A model here is Documentation/accounting/getdelays.c - that proved
quite useful and successful in the development of taskstats and I know
that several people are actually using getdelays.c as-is in serious
production environments.  If we hadn't provided and maintained that
code in the kernel tree, it's unlikely that taskstats would have proved
as useful to users.

