From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
To: Larry Woodman <lwoodman@redhat.com>
Cc: kosaki.motohiro@jp.fujitsu.com, "Rik van Riel" <riel@redhat.com>,
	"Ingo Molnar" <mingo@elte.hu>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Li Zefan" <lizf@cn.fujitsu.com>,
	"Pekka Enberg" <penberg@cs.helsinki.fi>,
	eduard.munteanu@linux360.ro, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, rostedt@goodmis.org
Subject: Re: [Patch] mm tracepoints update - use case.
Date: Mon, 22 Jun 2009 12:37:09 +0900 (JST)
Message-ID: <20090622115756.21F3.A69D9226@jp.fujitsu.com>
In-Reply-To: <1245352954.3212.67.camel@dhcp-100-19-198.bos.redhat.com>

Hi

> Thanks for the feedback Kosaki!
> 
> 
> > Scenario 1. OOM killer happened. Why? And who brought it on?
> 
> Doesn't the showmem() and stack trace to the console when the OOM kill
> occurred show enough in the majority of cases?  I realize that direct
> alloc_pages() calls are not accounted for here but that can be really
> invasive.

showmem() displays the _result_ of memory usage and fragmentation,
but administrators often need to know the _reason_.

Plus, kmemtrace already traces slab allocate/free activity.
Do you mean you think this is really invasive?


> > Scenario 2. page allocation failure by memory fragmentation
> 
> Are you talking about order>0 allocation failures here?  Most of the
> slabs are single page allocations now.

Yes, order>0.
But I am confused. Why do you talk about slab rather than page alloc?

Note that non-x86 architectures frequently use order-1 allocations for
allocating stacks.
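
For reference, here is a minimal sketch of how the current availability of order>0 blocks could be inspected from userland. It assumes the usual /proc/buddyinfo layout ("Node N, zone NAME" followed by one free-block count per order). The function names and the sample line are illustrative only and are not part of the patch under discussion. Note that this shows only the result (the current fragmentation state), not the reason for it.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order]} from buddyinfo text."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: "Node", "0,", "zone", "Normal", count0, count1, ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        result[(node, fields[3])] = [int(c) for c in fields[4:]]
    return result


def free_blocks_at_or_above(counts, order):
    """Number of free blocks whose order is >= the requested order."""
    return sum(counts[order:])


# Hypothetical sample line, used when /proc/buddyinfo is not readable.
SAMPLE = "Node 0, zone   Normal   4096     12      0      0      0\n"

if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    for (node, zone), counts in parse_buddyinfo(text).items():
        # An order-1 request needs at least one free block of order >= 1.
        print(node, zone, "blocks usable for order-1:",
              free_blocks_at_or_above(counts, 1))
```

A zone with many order-0 blocks but zero blocks at order 1 and above is the fragmented case: there is plenty of free memory, yet an order-1 allocation cannot be satisfied without reclaim or compaction.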



> > Scenario 3. try_to_free_pages() causes very long latency. Why?
> 
> This is available in the mm tracepoints, they all include timestamps.

Perhaps not.
Administrators need to know the reason, not the accumulated time. The time is only the result.

We can guess at some reasons:
  - IO congestion
  - memory is consumed faster than it is reclaimed
  - memory fragmentation

But these are only guesses. We often need actual data.


> > Scenario 4. sar output shows that free memory dropped dramatically 10 minutes ago, and
> >             it has already recovered. What happened?
> 
> Is this really important?  It would take buffering lots of data to
> figure out what happened in the past.

OK, my scenario description was a bit wrong.

If a userland process explicitly consumes memory or explicitly writes
a lot of data, that is true.

Is this more appropriate?

"A userland process takes the same action periodically, but free memory
dropped only 10 minutes ago. Why?"
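
To make the scenario concrete, here is a small sketch of how such a drop could be located in periodically sampled data. It assumes samples of the MemFree value from /proc/meminfo taken at fixed intervals. The function names, the threshold, and the sample numbers are hypothetical. Like sar, it can only say when free memory dropped, not why.

```python
def parse_memfree_kb(meminfo_text):
    """Extract the MemFree value (in kB) from /proc/meminfo text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            return int(line.split()[1])
    raise ValueError("MemFree not found")


def find_drops(samples, threshold_kb):
    """Return (timestamp, drop_kb) for each sample whose free memory fell
    by at least threshold_kb compared with the previous sample.

    samples is a list of (timestamp, memfree_kb) pairs in time order.
    """
    drops = []
    for (_, prev), (ts, cur) in zip(samples, samples[1:]):
        if prev - cur >= threshold_kb:
            drops.append((ts, prev - cur))
    return drops


if __name__ == "__main__":
    # Hypothetical samples taken every 600 seconds.
    samples = [(0, 1000), (600, 990), (1200, 400), (1800, 950)]
    print(find_drops(samples, 500))
```

The interesting question in this scenario starts where the sketch ends: the workload did the same thing in every interval, so the data needed is what the kernel was doing during the one interval that stands out.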



> >   - suspects
> >     - kernel memory leak
> 
> Other than direct callers to the page allocator isn't that covered with
> the kmemtrace stuff?

Yeah.
Perhaps enhancing kmemtrace to cover the page allocator is a good approach.


> >     - userland memory leak
> 
> The mm tracepoints track all user space allocations and frees (perhaps
> too many?).

hmhm.


> 
> >     - stupid driver uses too much memory
> 
> hopefully kmemtrace will catch this?

Ditto.
I agree that enhancing kmemtrace is a good idea.

> 
> >     - userland application suddenly starts to use much memory
> 
> The mm tracepoints track all user space allocations and frees.

ok.


> >   - what information is valuable?
> >     - slab usage information (kmemtrace already does)
> >     - page allocator usage information
> >     - rss of all processes when the OOM happened
> >     - why recent try_to_free_pages() can't reclaim any page?
> 
> The counters in the mm tracepoints do give counts but not the reasons
> that the pagereclaim code fails.

That's a very important key point. Please don't ignore it.




