From: Marinko Catovic
Date: Fri, 30 Nov 2018 13:01:49 +0100
Subject: Re: Caching/buffers become useless after some time
To: Vlastimil Babka
Cc: Michal Hocko, linux-mm@kvack.org, Christopher Lameter

On Fri, Nov 2, 2018 at 15:59, Vlastimil Babka wrote:
>
> Forgot to answer this:
>
> On 10/31/18 3:53 PM, Marinko Catovic wrote:
> > Well, caching of any operations with find/du is not necessary imho
> > anyway, since walking over all these millions of files in that time
> > period is really not worth caching at all - if there is a way, as you
> > mentioned, to limit the commands there, that would be great.
> > I also want to mention that these operations were in use with 3.x
> > kernels as well, for years, with absolutely zero issues.
>
> Yep, something had to change at some point. Possibly the
> reclaim/compaction loop. Probably not the way dentries/inodes are
> being cached, though.
>
> > 2 > drop_caches right after that is something I considered; I just
> > had some bad experience with this, since I tried it around 5:00 AM
> > in the first place to give it enough spare time to finish, as
> > sync; echo 2 > drop_caches can take some time - hence my question
> > about lowering the limits in mm/vmscan.c, void drop_slab_node(int nid).
> >
> > I could do this effectively right after find/du at 07:45, just hoping
> > that it finishes soon enough - in one worst case it took over 2
> > hours (from 05:00 AM to 07:00 AM), since the host was busy during
> > that time with find/du, never having freed enough caches to continue,
> > hence
>
> Dropping caches while find/du is still running would be
> counter-productive. If done after it has already finished, it
> shouldn't be so disruptive.
>
> > my question about letting it stop earlier with the modification of
> > drop_slab_node ... it was just an idea, nevermind if you believe
> > that it was a bad one :)
>
> Finding a universally "correct" threshold could easily be impossible.
> I guess the proper solution would be to drop the while loop and
> restructure the shrinking so that it does a single pass through all
> objects.

Well, after a few weeks of making sure, the results look very promising: there have been no issues any more since setting up the cgroup with the limit.
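For reference, the setup now looks roughly like the sketch below - a minimal example assuming a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory; the group name "nightly" and the 4G limit are placeholders, not the exact values in use here:

  # create the group and set its memory limit (name/value are examples)
  mkdir /sys/fs/cgroup/memory/nightly
  echo 4G > /sys/fs/cgroup/memory/nightly/memory.limit_in_bytes

  # move the current shell into the group; children (find/du) inherit it
  echo $$ > /sys/fs/cgroup/memory/nightly/tasks

  # ... run the nightly find/du jobs from this shell ...

With that, the nightly jobs can only grow the page cache up to the group's limit instead of pushing out the whole host's caches.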
This workaround is a good idea anyway: it keeps the nightly processes from eating up all the caches/buffers, which become useless by the morning anyway, so performance got even better - although the underlying issue is not fixed by it. Since other people will sooner or later be affected as well, imho, hopefully you'll figure out a fix soon.

Nevertheless, I ran into a new problem there. Writing the PID into the tasks file (echo $$ > ../tasks, or directly via fprintf(tasks_fp, "%d", getpid()) in C) works very well, but I had problems with daemons (e.g. a SQL server) that I wanted to start from within that cgroup-controlled binary: the SQL server's task gets killed, since the memory limit is exceeded. I would not like to set memory.limit_in_bytes to something huge like 30G just to be safe; I'd rather handle this with a wrapper script, for example:

1) the cgroup-controlled instance starts the wrapper script
2) the wrapper script excludes itself from the tasks PID list (so it is no longer controlled)
3) it starts whatever should continue to run normally, without the memory restriction

Currently I fail to manage this, since I do not know how to do step 2. echo $PID > tasks adds the PID, but how would one remove the wrapper script's PID from there? I came up with:

cat /cgpath/A/tasks | sed "/$$/d" | cat > /cgpath/A/tasks

..which produces a list without the current PID, but the write back to tasks fails with "cat: write error: Invalid argument", since tasks is not a regular file.
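From what I gather about cgroup v1 semantics, a task always belongs to exactly one cgroup per hierarchy, so a PID cannot be deleted from a tasks file directly; it is only removed implicitly by writing it into the tasks file of another group, e.g. the root of the memory hierarchy. So step 2 might look like the following untested sketch (assuming the controller is mounted at /sys/fs/cgroup/memory, and that the wrapper has permission to write to the root tasks file):

  #!/bin/sh
  # wrapper.sh - escape the limited cgroup, then exec the real command
  # usage: wrapper.sh /path/to/daemon [args...]

  # joining the hierarchy root implicitly drops $$ from /cgpath/A/tasks
  echo $$ > /sys/fs/cgroup/memory/tasks

  # the daemon inherits the root group, i.e. no memory restriction
  exec "$@"

Does that sound right, or is there a more direct way to detach a PID?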