> Hmm it's actually interesting to see GFP_TRANSHUGE there and not
> GFP_TRANSHUGE_LIGHT. What's your thp defrag setting? (cat
> /sys/kernel/mm/transparent_hugepage/enabled). Maybe it's set to
> "always", or there's a heavily faulting process that's using
> madvise(MADV_HUGEPAGE). If that's the case, setting it to "defer" or
> even "never" could be a workaround.

cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never

According to the docs this is the default:

> "madvise" will enter direct reclaim like "always" but only for regions
> that have used madvise(MADV_HUGEPAGE). This is the default behaviour.

Would any change there kick in immediately, even when already in the
100M/10G state?

> or there's a heavily faulting process that's using madvise(MADV_HUGEPAGE)

Are you suggesting that a single process can cause this? How would one be
able to identify it? Should killing it allow the cache to be populated
again instantly? If so, I could start killing processes on the host one by
one until I observe an improvement. So far I can tell that it is not the
database server, since restarting it did not help at all.

Please keep in mind that I can see how buffers (the 100MB value) are
oscillating. In the cache-useless state the value jumps around literally
every second, e.g. from 100 to 102, then 99, 104, 85, 101, 105, 98, and so
on, while over the course of days it drifts from the well-populated several
GB of the beginning down to those 100MB. So anything that has an effect
should be measurable instantly; to date only dropping caches achieves that.

Please tell me if you need any measurements again, at what time or in what
state, with code snippets perhaps to fit your needs; a few sketches of what
I could run are appended below the quoted mail.

On Thu, 23 Aug 2018 at 14:21, Michal Hocko wrote:

> On Thu 23-08-18 14:10:28, Vlastimil Babka wrote:
> > On 08/22/2018 10:02 PM, Marinko Catovic wrote:
> > >> It might be also interesting to do in the problematic state, instead of
> > >> dropping caches:
> > >>
> > >> - save snapshot of /proc/vmstat and /proc/pagetypeinfo
> > >> - echo 1 > /proc/sys/vm/compact_memory
> > >> - save new snapshot of /proc/vmstat and /proc/pagetypeinfo
> > >
> > > There was just a worst case in progress, about 100MB/10GB were used,
> > > super-low performance, but I could not see any improvement there after
> > > echo 1. I watched this for about 3 minutes; the cache usage did not change.
> > >
> > > pagetypeinfo before echo https://pastebin.com/MjSgiMRL
> > > pagetypeinfo 3min after echo https://pastebin.com/uWM6xGDd
> > >
> > > vmstat before echo https://pastebin.com/TjYSKNdE
> > > vmstat 3min after echo https://pastebin.com/MqTibEKi
> >
> > OK, that confirms compaction is useless here. Thanks.
> >
> > It also shows that all orders except order-9 are in fact plentiful.
> > Michal's earlier summary of the trace shows that most allocations are up
> > to order-3 and should be fine, the exception is THP:
> >
> >  277 9 GFP_TRANSHUGE|__GFP_THISNODE
>
> But please note that this is not from the time when the page cache
> dropped to the observed values. So we do not know what happened at the
> time.
>
> Anyway, 277 THP pages paging out such a large page cache amount would be
> more than unexpected, even for explicitly costly THP fault-in methods.
> --
> Michal Hocko
> SUSE Labs
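
P.S., the sketches mentioned above:

If I understood the suggested workaround correctly, this is roughly what I
would run to try it (a sketch; I assume the defrag knob is what is meant
here, not enabled, and that this kernel accepts "defer"):

  # show the current THP defrag mode (the active value is in brackets)
  cat /sys/kernel/mm/transparent_hugepage/defrag

  # stop stalling THP faults in direct compaction; defer the work instead
  echo defer > /sys/kernel/mm/transparent_hugepage/defrag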
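
For identifying a process that uses madvise(MADV_HUGEPAGE), a minimal
sketch of what I could run, assuming it is enough to look for the "hg"
flag in the VmFlags lines of /proc/<pid>/smaps (run as root so every
process's smaps is readable):

  # list PIDs (and command names) that have at least one VMA carrying the
  # MADV_HUGEPAGE advice flag ("hg" in VmFlags)
  for d in /proc/[0-9]*; do
      if grep -Eq '^VmFlags:.* hg( |$)' "$d/smaps" 2>/dev/null; then
          printf '%s\t%s\n' "${d#/proc/}" "$(cat "$d/comm" 2>/dev/null)"
      fi
  done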
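
And in case numbers from right around the time the cache collapses would
help, this is what I would leave running in the background (a sketch;
/root/snap is just an assumed scratch directory and the one-minute
interval is arbitrary):

  mkdir -p /root/snap
  # keep timestamped copies of meminfo and vmstat, one pair per minute
  while true; do
      ts=$(date +%Y%m%d-%H%M%S)
      cp /proc/meminfo "/root/snap/meminfo.$ts"
      cp /proc/vmstat  "/root/snap/vmstat.$ts"
      sleep 60
  done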