From: Vlastimil Babka <vbabka@suse.cz>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>,
	Michal Hocko <mhocko@suse.cz>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: OOM killer changes
Date: Tue, 16 Aug 2016 09:44:34 +0200
Message-ID: <b6cf97a4-9260-0b9a-d2f7-00905325d773@suse.cz>
In-Reply-To: <20160816031222.GC16913@js1304-P5Q-DELUXE>

On 08/16/2016 05:12 AM, Joonsoo Kim wrote:
> On Mon, Aug 15, 2016 at 11:16:36AM +0200, Vlastimil Babka wrote:
>> On 08/15/2016 06:48 AM, Ralf-Peter Rohbeck wrote:
>>> On 02.08.2016 12:25, Ralf-Peter Rohbeck wrote:
>>>>
>>> Took me a little longer than expected due to work. The failure wouldn't 
>>> happen for a while and so I started a couple of scripts and let them 
>>> run. When I checked today the server didn't respond on the network and 
>>> sure enough it had killed everything. This is with 4.7.0 with the config 
>>> based on Debian 4.7-rc7.
>>>
>>> trace_pipe got a little big (5GB) so I uploaded the logs to 
>>> https://filebin.net/box0wycfouvhl6sr/OOM_4.7.0.tar.bz2. before_btrfs is 
>>> before the btrfs filesystems were mounted.
>>> I did run a btrfs balance because it creates IO load and I needed to 
>>> balance anyway. Maybe that's what caused it?
>>
>> pgmigrate_success        46738962
>> pgmigrate_fail          135649772
>> compact_migrate_scanned 309726659
>> compact_free_scanned   9715615169
>> compact_isolated        229689596
>> compact_stall 4777
>> compact_fail 3068
>> compact_success 1709
>> compact_daemon_wake 207834
>>
>> The migration failures are quite enormous. Very quick analysis of the
>> trace seems to confirm that these are mostly "real", as opposed to result
>> of failure to isolate free pages for migration targets, although the free
>> scanner spent a lot of time:
> 
> I don't think that the main reason for the OOM is 'real' migration failure.
> If that were the case, compaction would find the next migratable pages and
> eventually some of the pages would be migrated successfully.
> 
> pagetypeinfo shows that there are too many unmovable pageblocks.

Hmm, well spotted. It is also somewhat suspicious: I would expect
filesystem activity to result in reclaimable allocations, not unmovable
ones (not that it makes any difference for compaction).

Checking nr_slab_* in zoneinfo shows that it really should be mostly
reclaimable:

nr_slab_reclaimable 0
nr_slab_unreclaimable 0
nr_slab_reclaimable 32709
nr_slab_unreclaimable 2764
nr_slab_reclaimable 101525
nr_slab_unreclaimable 10852

Compared with:

Number of blocks type     Unmovable      Movable  Reclaimable   HighAtomic      Isolate 
Node 0, zone      DMA            1            7            0            0            0 
Node 0, zone    DMA32          893           72           51            0            0 
Node 0, zone   Normal         2780          155          137            0            0 

We have 188 reclaimable pageblocks, which is 96256 pages (at 512 pages per
pageblock). The sum of nr_slab_reclaimable is 134234, which suggests some
fallbacks into unmovable blocks. But the rest of those unmovable pageblocks
must be filled by something else... some btrfs buffers maybe?
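
The arithmetic above can be reproduced with a short sketch. It assumes 512
base pages per pageblock, which is what 96256 / 188 implies; the array and
function names are made up for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 512 base pages per pageblock (96256 / 188 in the text). */
#define PAGES_PER_PAGEBLOCK 512UL

/* "Reclaimable" column of pagetypeinfo, one entry per zone as listed. */
static const unsigned long reclaimable_blocks[] = { 0, 51, 137 };

/* nr_slab_reclaimable values from zoneinfo, in pages, in the order shown. */
static const unsigned long slab_reclaimable[] = { 0, 32709, 101525 };

/* Sum an array of per-zone counters. */
static unsigned long sum(const unsigned long *v, size_t n)
{
	unsigned long total = 0;

	assert(v != NULL);
	for (size_t i = 0; i < n; i++)
		total += v[i];
	return total;
}

/* Pages that all reclaimable pageblocks together can hold. */
static unsigned long reclaimable_capacity_pages(void)
{
	size_t n = sizeof(reclaimable_blocks) / sizeof(reclaimable_blocks[0]);

	return sum(reclaimable_blocks, n) * PAGES_PER_PAGEBLOCK;
}

/* Total reclaimable slab pages across all zones. */
static unsigned long slab_reclaimable_pages(void)
{
	size_t n = sizeof(slab_reclaimable) / sizeof(slab_reclaimable[0]);

	return sum(slab_reclaimable, n);
}

/*
 * Lower bound on reclaimable slab pages that cannot fit into reclaimable
 * pageblocks and therefore must have fallen back to other block types.
 */
static unsigned long min_fallback_pages(void)
{
	unsigned long cap = reclaimable_capacity_pages();
	unsigned long slab = slab_reclaimable_pages();

	return slab > cap ? slab - cap : 0;
}
```

With the numbers from this report, min_fallback_pages() comes to 37978
pages, roughly 74 pageblocks' worth if densely packed. That is a small
share of the 3674 unmovable pageblocks, so fallbacks alone would not
explain them.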

> The freepage scanner doesn't scan those pageblocks, so there is a good
> chance that it cannot find free pages even if the system has many of
> them. I think that this is the root cause of the problem.
> 
> It would be good to check whether the following workaround helps.

Yes, this might be a good idea, at least for higher compaction priorities.
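
To get a feel for how much the suitable_migration_target() check restricts
the free scanner here, a rough sketch can count pageblocks by type from the
pagetypeinfo output above. It assumes, as a simplification, that with the
check in place only Movable pageblocks are accepted as migration targets,
and that with the check removed every pageblock is a candidate. The
isolation_suitable() check is not modelled, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Column order of the pagetypeinfo table quoted above. */
enum block_type {
	UNMOVABLE, MOVABLE, RECLAIMABLE, HIGHATOMIC, ISOLATE, NR_TYPES
};

/* "Number of blocks type" rows: zones DMA, DMA32, Normal of node 0. */
static const unsigned long blocks[][NR_TYPES] = {
	{    1,   7,   0, 0, 0 },
	{  893,  72,  51, 0, 0 },
	{ 2780, 155, 137, 0, 0 },
};

#define NR_ZONES (sizeof(blocks) / sizeof(blocks[0]))

/* Pageblocks of one type, summed over all zones. */
static unsigned long blocks_of_type(enum block_type type)
{
	unsigned long total = 0;

	assert(type < NR_TYPES);
	for (size_t z = 0; z < NR_ZONES; z++)
		total += blocks[z][type];
	return total;
}

/* All pageblocks, summed over all zones and types. */
static unsigned long total_blocks(void)
{
	unsigned long total = 0;

	for (int t = 0; t < NR_TYPES; t++)
		total += blocks_of_type((enum block_type)t);
	return total;
}

/*
 * Pageblocks the free scanner may look at under the simplified model:
 * only Movable blocks with the type check, all blocks without it.
 */
static unsigned long scanner_candidates(int type_check_enabled)
{
	return type_check_enabled ? blocks_of_type(MOVABLE) : total_blocks();
}
```

Under these assumptions, 234 of 4096 pageblocks (under 6%) are candidates
with the check in place, while 3674 are Unmovable and get skipped.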

Thanks.

> Thanks.
> 
> ------------>8-----------
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 9affb29..965eddd 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1082,10 +1082,6 @@ static void isolate_freepages(struct compact_control *cc)
>                 if (!page)
>                         continue;
>  
> -               /* Check the block is suitable for migration */
> -               if (!suitable_migration_target(page))
> -                       continue;
> -
>                 /* If isolation recently failed, do not retry */
>                 if (!isolation_suitable(cc, page))
>                         continue;
> 
