linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>,
	Michal Hocko <mhocko@suse.cz>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: OOM killer changes
Date: Tue, 16 Aug 2016 09:44:34 +0200
Message-ID: <b6cf97a4-9260-0b9a-d2f7-00905325d773@suse.cz>
In-Reply-To: <20160816031222.GC16913@js1304-P5Q-DELUXE>

On 08/16/2016 05:12 AM, Joonsoo Kim wrote:
> On Mon, Aug 15, 2016 at 11:16:36AM +0200, Vlastimil Babka wrote:
>> On 08/15/2016 06:48 AM, Ralf-Peter Rohbeck wrote:
>>> On 02.08.2016 12:25, Ralf-Peter Rohbeck wrote:
>>>>
>>> Took me a little longer than expected due to work. The failure wouldn't 
>>> happen for a while and so I started a couple of scripts and let them 
>>> run. When I checked today the server didn't respond on the network and 
>>> sure enough it had killed everything. This is with 4.7.0 with the config 
>>> based on Debian 4.7-rc7.
>>>
>>> trace_pipe got a little big (5GB) so I uploaded the logs to 
>>> https://filebin.net/box0wycfouvhl6sr/OOM_4.7.0.tar.bz2. before_btrfs is 
>>> before the btrfs filesystems were mounted.
>>> I did run a btrfs balance because it creates IO load and I needed to 
>>> balance anyway. Maybe that's what caused it?
>>
>> pgmigrate_success        46738962
>> pgmigrate_fail          135649772
>> compact_migrate_scanned 309726659
>> compact_free_scanned   9715615169
>> compact_isolated        229689596
>> compact_stall 4777
>> compact_fail 3068
>> compact_success 1709
>> compact_daemon_wake 207834
>>
>> The migration failures are quite enormous. A very quick analysis of the
>> trace seems to confirm that these are mostly "real", as opposed to the result
>> of a failure to isolate free pages for migration targets, although the free
>> scanner spent a lot of time:
> 
> I don't think that the main reason for the OOM is 'real' migration failure.
> If that were the case, compaction would find the next migratable pages and
> eventually some of the pages would be migrated successfully.
> 
> pagetypeinfo shows that there are too many unmovable pageblocks.

Hmm, well spotted. It is also somewhat suspicious: I would expect
filesystem activity to result in reclaimable allocations, not unmovable
ones (not that it makes any difference for compaction).

Checking nr_slab_* in zoneinfo shows that slab really should be mostly
reclaimable:

nr_slab_reclaimable 0
nr_slab_unreclaimable 0
nr_slab_reclaimable 32709
nr_slab_unreclaimable 2764
nr_slab_reclaimable 101525
nr_slab_unreclaimable 10852

Compared with:

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic      Isolate 
Node 0, zone      DMA            1            7            0            0            0 
Node 0, zone    DMA32          893           72           51            0            0 
Node 0, zone   Normal         2780          155          137            0            0 

We have 188 reclaimable blocks, that's 96256 pages (at 512 pages per block).
The sum of nr_slab_reclaimable is 134234, which suggests some fallbacks into
unmovable blocks. But the rest of those unmovable pageblocks must be filled
by something else... some btrfs buffers maybe?
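
For anyone who wants to re-check the arithmetic above, here is a minimal sketch in C. It assumes 512 base pages per pageblock, which is what 96256 / 188 implies; the function and macro names are made up for illustration and are not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 512 base pages per pageblock, as implied by 96256 / 188. */
#define PAGES_PER_PAGEBLOCK 512UL

/* Convert a pageblock count (from pagetypeinfo) into base pages. */
unsigned long blocks_to_pages(unsigned long nr_blocks)
{
	return nr_blocks * PAGES_PER_PAGEBLOCK;
}

/* Sum per-zone counters, e.g. the nr_slab_reclaimable values from zoneinfo. */
unsigned long sum_counters(const unsigned long *vals, size_t n)
{
	unsigned long total = 0;

	assert(vals != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		total += vals[i];
	return total;
}

/*
 * Lower bound on slab pages that must live outside the given pageblocks,
 * assuming those pageblocks hold nothing but that kind of slab.
 */
unsigned long min_fallback_pages(unsigned long slab_pages,
				 unsigned long block_pages)
{
	return slab_pages > block_pages ? slab_pages - block_pages : 0;
}
```

Fed with the reclaimable block counts {0, 51, 137} and the nr_slab_reclaimable values {0, 32709, 101525}, this gives 96256 pages of capacity against 134234 slab pages, so at least 37978 reclaimable slab pages sit in other pageblock types. The same helpers applied to the unmovable column (1 + 893 + 2780 = 3674 blocks, or 1881088 pages) against a summed nr_slab_unreclaimable of 13616 pages show that unreclaimable slab accounts for only a sliver of those blocks.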

> The freepage scanner doesn't scan those pageblocks, so there is a high
> possibility that it cannot find free pages even if the system has many
> of them. I think that this is the root cause of the problem.
> 
> It would be good to check whether the following workaround helps.

Yes, this might be a good idea, at least for higher compaction priorities.

Thanks.

> Thanks.
> 
> ------------>8-----------
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 9affb29..965eddd 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1082,10 +1082,6 @@ static void isolate_freepages(struct compact_control *cc)
>                 if (!page)
>                         continue;
>  
> -               /* Check the block is suitable for migration */
> -               if (!suitable_migration_target(page))
> -                       continue;
> -
>                 /* If isolation recently failed, do not retry */
>                 if (!isolation_suitable(cc, page))
>                         continue;
> 
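
To illustrate why dropping that check could matter, below is a toy model in C, not kernel code. It assumes the filter boils down to "only scan movable pageblocks", which is a simplification of what suitable_migration_target() really tests; every type and function name here is hypothetical, and the free-page numbers in the note afterwards are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for pageblock migratetypes; not the kernel's definitions. */
enum toy_migratetype { TOY_UNMOVABLE, TOY_MOVABLE, TOY_RECLAIMABLE };

struct toy_pageblock {
	enum toy_migratetype type;
	unsigned long free_pages;	/* free base pages inside this block */
};

/* Simplified stand-in for the check the workaround removes. */
static bool toy_suitable_target(const struct toy_pageblock *pb)
{
	return pb->type == TOY_MOVABLE;
}

/*
 * Count the free pages a free scanner could isolate, with or without
 * the pageblock-type filter applied.
 */
unsigned long toy_scannable_free_pages(const struct toy_pageblock *blocks,
				       size_t n, bool apply_filter)
{
	unsigned long total = 0;

	assert(blocks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (apply_filter && !toy_suitable_target(&blocks[i]))
			continue;	/* block skipped, its free pages unseen */
		total += blocks[i].free_pages;
	}
	return total;
}
```

With invented blocks of (unmovable, 100 free), (unmovable, 200 free), (movable, 10 free) and (reclaimable, 50 free), the filtered scan sees only 10 free pages while the unfiltered one sees 360. That is the shape of the situation described above: plenty of free pages overall, but few in the pageblocks the scanner is willing to look at.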


