From: Florian Weimer <fw@deneb.enyo.de>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com, linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: Excessive xfs_inode allocations trigger OOM killer
Date: Wed, 21 Sep 2016 07:45:20 +0200	[thread overview]
Message-ID: <8737ktlsb3.fsf@mid.deneb.enyo.de> (raw)
In-Reply-To: <20160920214612.GJ340@dastard> (Dave Chinner's message of "Wed, 21 Sep 2016 07:46:12 +1000")

* Dave Chinner:

> [cc Michal, linux-mm@kvack.org]
>
> On Tue, Sep 20, 2016 at 10:56:31PM +0200, Florian Weimer wrote:
>> * Dave Chinner:
>> 
>> >>   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME 
>> >> 4121208 4121177  99%    0.88K 1030302        4   4121208K xfs_inode
>> >> 986286 985229  99%    0.19K  46966       21    187864K dentry
>> >> 723255 723076  99%    0.10K  18545       39     74180K buffer_head
>> >> 270263 269251  99%    0.56K  38609        7    154436K radix_tree_node
>> >> 140310  67409  48%    0.38K  14031       10     56124K mnt_cache
>> >
>> > That's not odd at all. It means your workload is visiting millions
>> > of inodes in your filesystem between serious memory pressure events.
>> 
>> Okay.
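
As an aside, the table above is slabtop output; roughly the same
summary can be reproduced straight from /proc/slabinfo.  A quick
sketch of mine (assuming 4 KiB pages and the slabinfo 2.1 field
layout; the field offsets below are my assumption, not from this
thread):

    #!/usr/bin/env python3
    # Print the five largest slab caches, largest first (needs root).
    # Cache size is computed as num_slabs * pages_per_slab * 4 KiB,
    # which matches the CACHE SIZE column above.
    caches = []
    with open("/proc/slabinfo") as f:
        next(f); next(f)                    # skip version and header lines
        for line in f:
            fields = line.split()
            name = fields[0]
            active, objs = int(fields[1]), int(fields[2])
            pages_per_slab = int(fields[5])
            num_slabs = int(fields[14])     # from the "slabdata" section
            caches.append((num_slabs * pages_per_slab * 4, objs, active, name))
    for kb, objs, active, name in sorted(caches, reverse=True)[:5]:
        print("%8d %8d %9dK %s" % (objs, active, kb, name))
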
>> 
>> >> (I have attached the /proc/meminfo contents in case it offers further
>> >> clues.)
>> >> 
>> >> Confronted with large memory allocations (from “make -j12” and
>> >> compiling GCC, so perhaps ~8 GiB of memory), the OOM killer kicks in
>> >> and kills some random process.  I would have expected that some
>> >> xfs_inodes are freed instead.
>> >
>> > The oom killer is unreliable and often behaves very badly, and
>> > that's typically not an XFS problem.
>> >
>> > What is the full output of the oom killer invocations from dmesg?
>> 
>> I've attached the dmesg output (two events).
>
> Copied from the traces you attached (I've left them intact below for
> reference):
>
>> [51669.515086] make invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
>> [51669.515092] CPU: 1 PID: 1202 Comm: make Tainted: G          I     4.7.1fw #1
>> [51669.515093] Hardware name: System manufacturer System Product Name/P6X58D-E, BIOS 0701    05/10/2011
>> [51669.515095]  0000000000000000 ffffffff812a7d39 0000000000000000 0000000000000000
>> [51669.515098]  ffffffff8114e4da ffff880018707d98 0000000000000000 000000000066ca81
>> [51669.515100]  ffffffff8170e88d ffffffff810fe69e ffff88033fc38728 0000000200000006
>> [51669.515102] Call Trace:
>> [51669.515108]  [<ffffffff812a7d39>] ? dump_stack+0x46/0x5d
>> [51669.515113]  [<ffffffff8114e4da>] ? dump_header.isra.12+0x51/0x176
>> [51669.515116]  [<ffffffff810fe69e>] ? oom_kill_process+0x32e/0x420
>> [51669.515119]  [<ffffffff811003a0>] ? page_alloc_cpu_notify+0x40/0x40
>> [51669.515120]  [<ffffffff810fdcdc>] ? find_lock_task_mm+0x2c/0x70
>> [51669.515122]  [<ffffffff810fea6d>] ? out_of_memory+0x28d/0x2d0
>> [51669.515125]  [<ffffffff81103137>] ? __alloc_pages_nodemask+0xb97/0xc90
>> [51669.515128]  [<ffffffff81076d9c>] ? copy_process.part.54+0xec/0x17a0
>> [51669.515131]  [<ffffffff81123318>] ? handle_mm_fault+0xaa8/0x1900
>> [51669.515133]  [<ffffffff81078614>] ? _do_fork+0xd4/0x320
>> [51669.515137]  [<ffffffff81084ecc>] ? __set_current_blocked+0x2c/0x40
>> [51669.515140]  [<ffffffff810013ce>] ? do_syscall_64+0x3e/0x80
>> [51669.515144]  [<ffffffff8151433c>] ? entry_SYSCALL64_slow_path+0x25/0x25
> .....
>> [51669.515194] DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15900kB
>> [51669.515202] DMA32: 45619*4kB (UME) 73*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 183060kB
>> [51669.515209] Normal: 39979*4kB (UE) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 159916kB
> .....
>
> Alright, that's what I suspected: a high-order allocation for a new
> kernel stack, and memory is so fragmented that the contiguous
> allocation fails. Really, this is a memory reclaim issue, not an XFS
> issue.  There is lots of reclaimable memory available, but memory
> reclaim is:
>
> 	a) not trying hard enough to reclaim reclaimable memory; and
> 	b) not waiting for memory compaction to rebuild contiguous
> 	   memory regions for high order allocations.
>
> Instead, it is declaring OOM and kicking the killer to free memory
> held by userspace.

Thanks.
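
Just to be sure I am reading the dump correctly: order=2 means the
fork() path needs PAGE_SIZE << 2 = 16 KiB of physically contiguous
memory, and the Normal line above shows nothing left beyond order 0.
A little sketch of that arithmetic (assuming 4 KiB x86-64 pages):

    # Free blocks per order 0..10 in the Normal zone, from the dump above.
    PAGE_SIZE = 4096
    normal = [39979] + [0] * 10
    need_kb = (PAGE_SIZE << 2) >> 10        # the failed order-2 request
    free_kb = sum(n * ((PAGE_SIZE << o) >> 10) for o, n in enumerate(normal))
    print("need %d kB contiguous" % need_kb)
    print("Normal zone: %d kB free, largest free block 4 kB" % free_kb)

So roughly 156 MiB is free in the Normal zone, yet the largest
contiguous piece is a single page, which matches your analysis
exactly.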

I have put the full kernel config here:

  <http://static.enyo.de/fw/volatile/config-4.7.1fw>
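
If it would help, I can also trigger compaction by hand and watch
whether higher orders come back in /proc/buddyinfo.  Something along
these lines (my sketch; needs root and CONFIG_COMPACTION=y):

    #!/usr/bin/env python3
    # Compare per-order free block counts before and after forcing
    # compaction of all zones via /proc/sys/vm/compact_memory.
    def buddyinfo():
        with open("/proc/buddyinfo") as f:
            return f.read()

    before = buddyinfo()
    with open("/proc/sys/vm/compact_memory", "w") as f:
        f.write("1")                        # a write triggers compaction
    print("before:\n%s\nafter:\n%s" % (before, buddyinfo()))

If order-2 and higher blocks reappear after an explicit compaction
run, that would point at (b) above.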


