From: Vlastimil Babka <vbabka@suse.cz>
To: Marinko Catovic <marinko.catovic@gmail.com>,
	Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org, Christopher Lameter <cl@linux.com>
Subject: Re: Caching/buffers become useless after some time
Date: Tue, 30 Oct 2018 18:00:23 +0100
Message-ID: <98305976-612f-cf6d-1377-2f9f045710a9@suse.cz>
In-Reply-To: <CADF2uSr2V+6MosROF7dJjs_Pn_hR8u6Z+5bKPqXYUUKx=5knDg@mail.gmail.com>

On 10/30/18 5:08 PM, Marinko Catovic wrote:
>> One notable thing here is that there shouldn't be any reason to do
>> direct reclaim when kswapd itself doesn't do anything. Either it is
>> blocked on something, but I find it quite surprising to see it in
>> that state for the whole 1500s time period, or we are simply not low on
>> free memory at all. That would point towards compaction-triggered memory
>> reclaim, which is accounted as direct reclaim as well. Direct
>> compaction triggered more than once a second on average. We shouldn't
>> really reclaim unless we are low on memory, but repeatedly failing
>> compaction could just add up and reclaim a lot in the end. There seem to
>> be quite a lot of low-order requests as per your trace buffer

I realized that the fact that slabs grew so large might be very
relevant. It means a lot of unmovable pages, and while they are slowly
being freed, the remaining ones are scattered all over memory, making it
impossible to compact successfully until the slabs are almost
*completely* freed. It's in fact the theoretical worst-case scenario for
compaction and fragmentation avoidance. Next time it would be nice to
also gather /proc/pagetypeinfo and /proc/slabinfo to see what grew so
much there (probably dentries and inodes).
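
A minimal sketch of how those two files could be collected and summarized.
The function names, the output directory layout and the "bytes per cache"
estimate (num_objs * objsize, columns 3 and 4 of the slabinfo 2.1 format)
are assumptions for illustration, and reading both files normally needs
root:

```shell
#!/bin/sh
# Hypothetical helper: copy pagetypeinfo and slabinfo from a proc-like
# directory into a timestamped snapshot directory and print its path.
snapshot_mm_state() {
    src_root="${1:-/proc}"
    out_dir="${2:-./mm-snapshots}/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$out_dir" || return 1
    for f in pagetypeinfo slabinfo; do
        # cat rather than cp: /proc files report size 0, cat reads to EOF.
        cat "$src_root/$f" > "$out_dir/$f" 2>/dev/null ||
            echo "could not read $src_root/$f" >&2
    done
    echo "$out_dir"
}

# Hypothetical helper: list slab caches by approximate size in bytes,
# largest first. Skips the two header lines of the slabinfo format.
top_slabs() {
    awk 'NR > 2 { printf "%.0f %s\n", $3 * $4, $1 }' "$1" | sort -rn
}
```

Calling `snapshot_mm_state` periodically (for example from cron) and then
`top_slabs <snapshot>/slabinfo | head` on each snapshot should show which
caches are growing.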

The question is why the problems appeared only some time after the
unmovable pollution. The trace showed me that the structure of
allocations wrt order+flags, as Michal breaks them down below, is not
significantly different in the last phase than in the whole trace.
Possibly the state of memory gradually changed so that the various
heuristics (fragindex, pageblock skip bits etc.) resulted in compaction
being tried more than initially, eventually hitting a very bad corner case.
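
One way to check whether compaction is being attempted more often over
time is to sample the compaction counters in /proc/vmstat. The sketch
below is only an illustration; the function name is made up, and it
assumes the compact_stall, compact_fail and compact_success counters are
present (they depend on the kernel being built with compaction support):

```shell
#!/bin/sh
# Hypothetical helper: print the direct-compaction counters from a
# vmstat-format file (defaults to /proc/vmstat).
compaction_counters() {
    file="${1:-/proc/vmstat}"
    awk '$1 ~ /^compact_(stall|fail|success)$/ { print $1, $2 }' "$file"
}
```

Sampling it at a fixed interval and comparing consecutive outputs gives
the rate of compaction stalls and failures in each phase.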

>> $ grep order trace-last-phase | sed 's@.*\(order=[0-9]*\).*gfp_flags=\(.*\)@\1 \2@' | sort | uniq -c
>>    1238 order=1 __GFP_HIGH|__GFP_ATOMIC|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>>    5812 order=1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>>     121 order=1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_THISNODE
>>      22 order=1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_THISNODE
>>  395910 order=1 GFP_KERNEL_ACCOUNT|__GFP_ZERO
>>  783055 order=1 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_ACCOUNT
>>    1060 order=1 __GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_THISNODE
>>    3278 order=2 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>>  797255 order=2 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_ACCOUNT
>>   93524 order=3 GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC
>>  498148 order=3 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_ACCOUNT
>>  243563 order=3 GFP_NOWAIT|__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP
>>      10 order=4 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_THISNODE
>>     114 order=7 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_THISNODE
>>   67621 order=9 GFP_TRANSHUGE|__GFP_THISNODE
>>
>> We can safely rule out NOWAIT and ATOMIC because those do not reclaim.
>> That leaves us with
>>    5812 order=1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>>     121 order=1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_THISNODE
>>      22 order=1 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_THISNODE
>>  395910 order=1 GFP_KERNEL_ACCOUNT|__GFP_ZERO

I suspect there are lots of short-lived processes, so these are probably
rapidly recycled and not causing compaction. They also seem to be pgd
allocations (2 pages due to PTI), not kernel stacks?

>>    1060 order=1 __GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_THISNODE
>>    3278 order=2 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE
>>      10 order=4 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_THISNODE
>>     114 order=7 __GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_THISNODE
>>   67621 order=9 GFP_TRANSHUGE|__GFP_THISNODE

I would again suspect those. IIRC we already confirmed earlier that the
THP defrag setting is madvise or defer+madvise, and that there are
processes using madvise(MADV_HUGEPAGE)? Did you ever try changing defrag
to plain 'defer'?
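
For reference, a small sketch for reading the active THP defrag mode. The
function name is hypothetical; the sysfs file lists all modes and marks
the active one with square brackets:

```shell
#!/bin/sh
# Hypothetical helper: print the active THP defrag mode, i.e. the word
# in square brackets in the sysfs file.
thp_defrag_mode() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/defrag}"
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$file"
}

# Switching to plain 'defer' (as root, not persistent across reboots):
#   echo defer > /sys/kernel/mm/transparent_hugepage/defrag
```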

>>
>> By and large, the kernel stack allocations are in the lead. You can get
>> some relief by enabling CONFIG_VMAP_STACK. There is also a notable
>> number of THP allocations. Just curious, are you running on a NUMA
>> machine? If yes, [1] might be relevant. Other than that, nothing really
>> jumped out at me.


> thanks a lot Vlastimil!

And Michal :)

> I would not really know whether this is NUMA; it is an ordinary
> server with an i7-8700 and ECC RAM. How would I find out?

Please provide /proc/zoneinfo and we'll see.
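
As a quick self-check, the number of distinct nodes listed in
/proc/zoneinfo can be counted; more than one node means a NUMA machine.
This is a rough sketch with a made-up function name, relying on the
"Node N, zone NAME" header lines of that file:

```shell
#!/bin/sh
# Hypothetical helper: count distinct node ids in a zoneinfo-format file
# (defaults to /proc/zoneinfo).
count_numa_nodes() {
    file="${1:-/proc/zoneinfo}"
    awk '/^Node [0-9]+, zone/ { sub(/,/, "", $2); seen[$2] = 1 }
         END { n = 0; for (k in seen) n++; print n }' "$file"
}
```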

> So I should do CONFIG_VMAP_STACK=y and try that..?

I suspect you already have it.
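
A sketch for checking that, assuming the distribution installs the kernel
config as /boot/config-$(uname -r) (this location is an assumption and
varies between distributions; some kernels expose /proc/config.gz
instead). The function name is made up:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given option is set to 'y' in a
# kernel config file (defaults to /boot/config-<running kernel>).
kernel_config_has() {
    opt="$1"
    cfg="${2:-/boot/config-$(uname -r)}"
    grep -q "^${opt}=y" "$cfg"
}

# Example:
#   kernel_config_has CONFIG_VMAP_STACK && echo "VMAP_STACK enabled"
```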
