From: Marinko Catovic <marinko.catovic@gmail.com>
To: Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, Vlastimil Babka <vbabka@suse.cz>,
	Christopher Lameter <cl@linux.com>
Subject: Re: Caching/buffers become useless after some time
Date: Fri, 26 Oct 2018 07:48:02 +0200
Message-ID: <CADF2uSoqzy0g-0=G_aq2DBjeBgmBF4NwM2rvzEqACHOeL_paAw@mail.gmail.com>
In-Reply-To: <CADF2uSrh=sUwKN1WLGzkQ0V=2Fgn0B8TGh7pY-ARJOvYq7Yn1Q@mail.gmail.com>

On Tue, 23 Oct 2018 at 19:41, Marinko Catovic
<marinko.catovic@gmail.com> wrote:
>
> On Mon, 22 Oct 2018 at 03:19, Marinko Catovic
> <marinko.catovic@gmail.com> wrote:
> >
> > On Wed, 29 Aug 2018 at 18:44, Marinko Catovic
> > <marinko.catovic@gmail.com> wrote:
> > >
> > >
> > >> > one host is at a healthy state right now, I'd run that over there immediately.
> > >>
> > >> Let's see what we can get from here.
> > >
> > >
> > > oh well, that went fast. Even with low values for buffers (around 100MB) and caches
> > > around 20G or so, the performance was nevertheless super-low, so I really had to drop
> > > the caches right now. This is the first time I have seen it happen with caches >10G, but hopefully
> > > this also provides a clue for you.
> > >
> > > Just after starting the stats I reset the setting from defer back to madvise - I suspect that this somehow
> > > caused the rapid reaction, since a few minutes later I saw the free RAM jump from 5GB to 10GB.
> > > After that I went afk, and returned to the pc when my monitoring systems went crazy telling me about downtime.
> > >
> > > If you think changing /sys/kernel/mm/transparent_hugepage/defrag back to its default, while it was
> > > on defer now for days, was a mistake, then please tell me.
> > >
> > > here you go: https://nofile.io/f/VqRg644AT01/vmstat.tar.gz
> > > trace_pipe: https://nofile.io/f/wFShvZScpvn/trace_pipe.gz
> > >
> >
> > There we go again.
> >
> > First of all, I had set up this monitoring on 1 host; as a matter of
> > fact the issue did not occur on that single one for weeks, so I set
> > it up again on all the hosts and it just happened again on another one.
> >
> > This issue is far from over, even after upgrading to the latest 4.18.12
> >
> > https://nofile.io/f/z2KeNwJSMDj/vmstat-2.zip
> > https://nofile.io/f/5ezPUkFWtnx/trace_pipe-2.gz
> >
> > Please note: the trace_pipe is quite big, but it covers the whole
> > transition from full RAM to unused RAM within just ~24 hours. The
> > measurements were started right after echo 3 > drop_caches and
> > stopped when the RAM was unused (it got re-used after another
> > echo 3 at the end).
> >
> > This issue has been around for about half a year now; any suggestions,
> > hints or solutions are greatly appreciated. Again, I cannot possibly
> > be the only one experiencing this; I may just be among the few who
> > actually notice it and are indeed suffering from very poor
> > performance with lots of I/O on cache/buffers.
> >
> > Also, I'd like to ask for a workaround until this is fixed someday:
> > echo 3 > drop_caches can take a very long time when the host is busy
> > with I/O in the background. According to some resources on the net,
> > dropping caches keeps running until some lower threshold is reached,
> > which becomes less and less likely when the host is really busy.
> > Could someone perhaps point out which threshold this is?
> > I was thinking of e.g. mm/vmscan.c
> >
> > void drop_slab_node(int nid)
> > {
> >         unsigned long freed;
> >
> >         do {
> >                 struct mem_cgroup *memcg = NULL;
> >
> >                 freed = 0;
> >                 do {
> >                         freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
> >                 } while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> >         } while (freed > 10);
> > }
> >
> > ..would it make sense to replace > 10 here with, for example, > 100?
> > I could easily adjust this, or any other relevant threshold, since I
> > compile the kernel in use myself.
> >
> > I'd just like drop_caches to be able to finish, so that the
> > workaround works until this issue is fixed. As mentioned, it can
> > take hours on a busy host, causing the host to hang (have low
> > performance), since buffers/caches are not used while drop_caches
> > is being set to 3, until the freeing has finished.
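
A toy model of the quoted loop may help show why the exit test matters on a busy host. This is only a sketch under invented assumptions: `passes_until_stop`, `free_fraction` and `refill_per_pass` are hypothetical names and numbers, not kernel interfaces, and the model ignores everything shrink_slab really does. It only mirrors the shape of `do { ... } while (freed > 10)`.

```python
def passes_until_stop(objects, free_fraction, refill_per_pass,
                      threshold=10, max_passes=10_000):
    """Toy model of the do/while loop; returns the pass count, or None if capped.

    Each pass frees int(objects * free_fraction) objects (a stand-in for
    shrink_slab), then background I/O adds refill_per_pass new objects
    before the next pass. All numbers are illustrative assumptions.
    """
    for passes in range(1, max_passes + 1):
        freed = int(objects * free_fraction)
        objects = objects - freed + refill_per_pass
        if freed <= threshold:  # mirrors the `while (freed > 10)` exit test
            return passes
    return None  # the loop never settled within max_passes


if __name__ == "__main__":
    print("idle host:", passes_until_stop(1000, 0.5, refill_per_pass=0))
    print("busy host, threshold 10:", passes_until_stop(1000, 0.5, refill_per_pass=50))
    print("busy host, threshold 100:",
          passes_until_stop(1000, 0.5, refill_per_pass=50, threshold=100))
```

In this model a steady refill above the threshold keeps the loop going indefinitely, while a larger threshold lets it exit earlier at the cost of leaving more objects unfreed. Whether that trade-off is acceptable in the real kernel is not something the model can answer.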
>
> By the way, it now seems to happen on the one mentioned host on a
> daily basis, dropping to 100M/10G every 24 hours, so it is actually
> a lot easier to capture relevant data/stats now, since it occurs
> again and again.
>
> Strangely, the other hosts have not been affected for days.
> So if there is anything you need to know besides the vmstat and
> trace_pipe files, please let me know.

As it has now happened for the 2nd time within 2 days, mainly on
the very same host I mentioned before and for which I gave the reports
in my previous reply, I just wanted to point out something that I
observed: earlier I stated that the buffers were really low and the
caches as well. However, I have now seen for the second or third time
that this applies to buffers far more significantly than to caches.
As an example: 50MB of buffers were in use, yet 10GB of caches, still
leaving around 20GB of RAM totally unused.
Note: buffers/caches were surely around 5GB/35GB in the healthy state
before, so both are still getting lower.
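
For logging these numbers over time, a minimal Python sketch that reads them from /proc/meminfo could look like the following. The helper names `parse_meminfo` and `summarize_mb` are made up for this example and do not come from any tool used in this thread; the code assumes the usual `Key:   value kB` layout of that file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    Values are in kB for lines that carry a kB unit; unit-less lines
    (e.g. HugePages_Total) are stored as plain counts.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # not a "Key: value" line
        try:
            fields[key.strip()] = int(parts[0])
        except ValueError:
            continue  # skip anything that is not a plain integer
    return fields


def summarize_mb(fields):
    """Return Buffers, Cached and MemFree in whole MB (kB // 1024)."""
    return {name: fields.get(name, 0) // 1024
            for name in ("Buffers", "Cached", "MemFree")}


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        print(summarize_mb(parse_meminfo(f.read())))
```

Run periodically (for example from cron), the printed dictionary gives a simple time series of the buffers/caches/free split described above.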

So the performance dropped so much that all services on the host
basically stopped working because there was so much I/O wait, again. I
tried to collect the file contents people asked me to post, so
besides the trace_pipe and vmstat folder from my previous post, here
are the others, taken while in the 50MB buffers state:

cat /proc/pagetypeinfo https://pastebin.com/W1sJscsZ
cat /proc/slabinfo     https://pastebin.com/9ZPU3q7X
cat /proc/zoneinfo     https://pastebin.com/RMTwtXGr

Hopefully you can read something from this.
As always, feel free to ask whatever info you'd like me to share.
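
Since the bad state has to be caught while it lasts, a small script could save these three files automatically once Buffers falls below some limit. The following is only a sketch: the 100MB limit, the output directory `/tmp/mm-snapshots` and the helper names `buffers_low` and `snapshot` are assumptions made for illustration. Reading /proc/slabinfo and /proc/pagetypeinfo may require root, depending on the kernel.

```python
import os
import shutil
import time

PROC_FILES = ("/proc/pagetypeinfo", "/proc/slabinfo", "/proc/zoneinfo")


def buffers_low(meminfo_text, threshold_kb):
    """True if the Buffers line of /proc/meminfo-style text is below threshold_kb."""
    for line in meminfo_text.splitlines():
        if line.startswith("Buffers:"):
            return int(line.split()[1]) < threshold_kb
    return False  # no Buffers line found


def snapshot(paths, out_dir, stamp=None):
    """Copy each readable file to out_dir/<basename>.<stamp>; return the copies made."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for path in paths:
        target = os.path.join(out_dir, f"{os.path.basename(path)}.{stamp}")
        try:
            with open(path, "rb") as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
        except OSError:
            continue  # missing or unreadable (e.g. not root): skip this file
        saved.append(target)
    return saved


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        # 100MB is an arbitrary example limit, not a recommended value.
        if buffers_low(f.read(), threshold_kb=100 * 1024):
            print(snapshot(PROC_FILES, "/tmp/mm-snapshots"))
```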
