From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: Re: [PATCH 2/3] oom, trace: Add oom detection tracepoints
Date: Wed, 14 Dec 2016 18:17:43 +0100
Message-ID: <dc8350cf-4317-e4f7-7a26-b6a13e48c2eb@suse.cz>
In-Reply-To: <20161214145324.26261-3-mhocko@kernel.org>

On 12/14/2016 03:53 PM, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>

I guess the Subject should be more specific to the tracepoint?

> should_reclaim_retry is the central decision point for declaring the
> OOM. It might be really useful to expose data used for this decision
> making when debugging unexpected OOM situations.
>
> Say we have an OOM report:
> [   52.264001] mem_eater invoked oom-killer: gfp_mask=0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=0, order=0, oom_score_adj=0
> [   52.267549] CPU: 3 PID: 3148 Comm: mem_eater Tainted: G        W       4.8.0-oomtrace3-00006-gb21338b386d2 #1024
>
> Now we can check the tracepoint data to see how we have ended up in this
> situation:
>        mem_eater-3148  [003] ....    52.432801: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11134 min_wmark=11084 no_progress_loops=1 wmark_check=1
>        mem_eater-3148  [003] ....    52.433269: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11103 min_wmark=11084 no_progress_loops=1 wmark_check=1
>        mem_eater-3148  [003] ....    52.433712: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11100 min_wmark=11084 no_progress_loops=2 wmark_check=1
>        mem_eater-3148  [003] ....    52.434067: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11097 min_wmark=11084 no_progress_loops=3 wmark_check=1
>        mem_eater-3148  [003] ....    52.434414: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11094 min_wmark=11084 no_progress_loops=4 wmark_check=1
>        mem_eater-3148  [003] ....    52.434761: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11091 min_wmark=11084 no_progress_loops=5 wmark_check=1
>        mem_eater-3148  [003] ....    52.435108: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11087 min_wmark=11084 no_progress_loops=6 wmark_check=1
>        mem_eater-3148  [003] ....    52.435478: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11084 min_wmark=11084 no_progress_loops=7 wmark_check=0
>        mem_eater-3148  [003] ....    52.435478: reclaim_retry_zone: node=0 zone=DMA order=0 reclaimable=0 available=1126 min_wmark=179 no_progress_loops=7 wmark_check=0
>
> From the above we can quickly deduce that the reclaim stopped making
> any progress (see no_progress_loops increased in each round) and while
> there were still some 51 reclaimable pages they couldn't be dropped
> for some reason (vmscan trace points would tell us more about that
> part). available will represent reclaimable + free_pages scaled down per
> no_progress_loops factor. This is essentially an optimistic estimate of
> how much memory we would have when reclaiming everything.  This can be
> compared to min_wmark to get a rough idea but the wmark_check tells the
> result of the watermark check which is more precise (includes lowmem
> reserves, considers the order etc.). As we can see no zone is eligible
> in the end and that is why we have triggered the oom in this situation.
>
> Please note that higher order requests might fail on the wmark_check even
> when there is much more memory available than min_wmark - e.g. when the
> memory is fragmented. A follow up tracepoint will help to debug those
> situations.
>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>
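
For readers who want to repeat the analysis in the quoted changelog on their own trace output, here is a minimal sketch in Python. It only relies on the key=value layout visible in the reclaim_retry_zone lines quoted above. The helper names (parse_retry_line, last_round, any_zone_eligible) are hypothetical, and treating "the records with the highest no_progress_loops" as the final retry round is an assumption made for illustration, not something the patch defines.

```python
import re

# Matches the key=value fields printed after "reclaim_retry_zone:".
FIELD_RE = re.compile(r"(\w+)=(\S+)")
MARKER = "reclaim_retry_zone:"

# Three lines copied from the trace quoted in the changelog.
SAMPLE = """\
mem_eater-3148  [003] ....    52.435108: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11087 min_wmark=11084 no_progress_loops=6 wmark_check=1
mem_eater-3148  [003] ....    52.435478: reclaim_retry_zone: node=0 zone=DMA32 order=0 reclaimable=51 available=11084 min_wmark=11084 no_progress_loops=7 wmark_check=0
mem_eater-3148  [003] ....    52.435478: reclaim_retry_zone: node=0 zone=DMA order=0 reclaimable=0 available=1126 min_wmark=179 no_progress_loops=7 wmark_check=0
"""


def parse_retry_line(line):
    """Return the fields of one reclaim_retry_zone line as a dict, or None."""
    if MARKER not in line:
        return None
    payload = line.split(MARKER, 1)[1]
    fields = {}
    for key, value in FIELD_RE.findall(payload):
        # Numeric fields become ints; zone names such as DMA32 stay strings.
        fields[key] = int(value) if value.isdigit() else value
    return fields


def parse_trace(text):
    """Parse every reclaim_retry_zone line found in a block of trace text."""
    parsed = (parse_retry_line(line) for line in text.splitlines())
    return [record for record in parsed if record is not None]


def last_round(records):
    """Records sharing the highest no_progress_loops value (assumed final round)."""
    if not records:
        return []
    top = max(r["no_progress_loops"] for r in records)
    return [r for r in records if r["no_progress_loops"] == top]


def any_zone_eligible(records):
    """True if at least one zone passed the watermark check in the final round."""
    return any(r["wmark_check"] == 1 for r in last_round(records))


if __name__ == "__main__":
    records = parse_trace(SAMPLE)
    for r in last_round(records):
        print(r["zone"], r["available"], r["min_wmark"], r["wmark_check"])
    print("eligible zone left:", any_zone_eligible(records))
```

On the sample lines, the final round contains the DMA32 and DMA records and neither passes wmark_check, which matches the changelog's conclusion that no zone was eligible when the OOM was declared.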


