From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-wr1-f72.google.com (mail-wr1-f72.google.com [209.85.221.72])
	by kanga.kvack.org (Postfix) with ESMTP id C51156B2EA2
	for ; Fri, 24 Aug 2018 04:11:22 -0400 (EDT)
Received: by mail-wr1-f72.google.com with SMTP id a37-v6so7147764wrc.5
	for ; Fri, 24 Aug 2018 01:11:22 -0700 (PDT)
Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41])
	by mx.google.com with SMTPS id h5-v6sor2480928wrm.83.2018.08.24.01.11.21
	for (Google Transport Security); Fri, 24 Aug 2018 01:11:21 -0700 (PDT)
MIME-Version: 1.0
References: <20180806120042.GL19540@dhcp22.suse.cz>
	<010001650fe29e66-359ffa28-9290-4e83-a7e2-b6d1d8d2ee1d-000000@email.amazonses.com>
	<20180806181638.GE10003@dhcp22.suse.cz>
	<20180821064911.GW29735@dhcp22.suse.cz>
	<11b4f8cd-6253-262f-4ae6-a14062c58039@suse.cz>
	<6ef03395-6baa-a6e5-0d5a-63d4721e6ec0@suse.cz>
	<20180823122111.GG29735@dhcp22.suse.cz>
	<76c6e92b-df49-d4b5-27f7-5f2013713727@suse.cz>
In-Reply-To: <76c6e92b-df49-d4b5-27f7-5f2013713727@suse.cz>
From: Marinko Catovic
Date: Fri, 24 Aug 2018 10:11:09 +0200
Message-ID:
Subject: Re: Caching/buffers become useless after some time
Content-Type: multipart/alternative; boundary="000000000000bfca56057429eb2d"
Sender: owner-linux-mm@kvack.org
List-ID:
To: Vlastimil Babka
Cc: Michal Hocko , Christopher Lameter , linux-mm@kvack.org

--000000000000bfca56057429eb2d
Content-Type: text/plain; charset="UTF-8"

>
> 1. Send the current value of /sys/kernel/mm/transparent_hugepage/defrag
> 2. Unless it's 'defer' or 'never' already, try changing it to 'defer'.
>

/sys/kernel/mm/transparent_hugepage/defrag is
always defer defer+madvise [madvise] never

I *think* I already played around with these values; as far as I remember, `never`
almost caused the system to hang, or at least it did while I switched back to madvise.

Shall I switch it to defer and observe (all hosts are running fine just now), or
switch to defer while it is in the bad state?
And when doing this, should the improvement be measurable immediately?
I need to know how long to hold this before dropping caches becomes necessary.

> Ah, checked the trace and it seems to be "php-cgi". Interesting that
> they use madvise(MADV_HUGEPAGE). Anyway the above still applies.

You know, that's at least an interesting hint. Look at this:
https://ckon.wordpress.com/2015/09/18/php7-opcache-performance/

This was experimental there, but a more recent version seems to have it on
by default, since I need to disable it on request (which implies to me that it is on by default).
It is however *disabled* in the runtime configuration (and not in effect, I just confirmed that).

It would be interesting to know whether madvise(MADV_HUGEPAGE) is then active
somewhere else, since it is in the dump as you observed.

Please note that `killing` php-cgi would not make any difference then, since these processes
are started on request for every user and killed after whatever script is finished. This may
invoke about 10-50 forks every second, depending on load (with different system users).

That also *may* explain why it is not so deterministic (sometimes sooner, sometimes later,
sometimes on one host and not on the other), since there are multiple php-cgi versions available
and not everyone is using the same version - most people stick to legacy versions.

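For reference, the active mode in that sysfs file is the entry shown in square brackets. Below is a minimal Python sketch for reading it, so the value can be logged alongside other observations while a host is in the good or bad state. The helper names `active_mode` and `read_thp_defrag` are hypothetical and only for illustration; the path is the one quoted above.

```python
from pathlib import Path

# Path as quoted in the message above.
THP_DEFRAG = Path("/sys/kernel/mm/transparent_hugepage/defrag")


def active_mode(line: str) -> str:
    """Return the bracketed (currently selected) entry of a sysfs choice line."""
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no bracketed entry in {line!r}")


def read_thp_defrag(path: Path = THP_DEFRAG) -> str:
    """Read the sysfs file and return the active defrag mode (hypothetical helper)."""
    return active_mode(path.read_text())


if __name__ == "__main__":
    # Uses the exact line reported above instead of touching the live system.
    print(active_mode("always defer defer+madvise [madvise] never"))
```

Switching the mode is done by writing the desired value (for example `defer`) to the same file as root.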
--000000000000bfca56057429eb2d--