Date: Wed, 14 Aug 2019 10:58:31 +0200
From: Michal Hocko
To: Khalid Aziz
Cc: akpm@linux-foundation.org, vbabka@suse.cz, mgorman@techsingularity.net,
 dan.j.williams@intel.com, osalvador@suse.de, richard.weiyang@gmail.com,
 hannes@cmpxchg.org, arunks@codeaurora.org, rppt@linux.vnet.ibm.com,
 jgg@ziepe.ca, amir73il@gmail.com, alexander.h.duyck@linux.intel.com,
 linux-mm@kvack.org,
 linux-kernel-mentees@lists.linuxfoundation.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] Add predictive memory reclamation and compaction
Message-ID: <20190814085831.GS17933@dhcp22.suse.cz>
References: <20190813014012.30232-1-khalid.aziz@oracle.com>
 <20190813140553.GK17933@dhcp22.suse.cz>
 <3cb0af00-f091-2f3e-d6cc-73a5171e6eda@oracle.com>
In-Reply-To: <3cb0af00-f091-2f3e-d6cc-73a5171e6eda@oracle.com>

On Tue 13-08-19 09:20:51, Khalid Aziz wrote:
> On 8/13/19 8:05 AM, Michal Hocko wrote:
> > On Mon 12-08-19 19:40:10, Khalid Aziz wrote:
> > [...]
> >> Patch 1 adds code to maintain a sliding lookback window of (time, number
> >> of free pages) points which can be updated continuously and adds code to
> >> compute best fit line across these points. It also adds code to use the
> >> best fit lines to determine if kernel must start reclamation or
> >> compaction.
> >>
> >> Patch 2 adds code to collect data points on free pages of various orders
> >> at different points in time, uses code in patch 1 to update sliding
> >> lookback window with these points and kicks off reclamation or
> >> compaction based upon the results it gets.
> >
> > An important piece of information missing in your description is why
> > do we need to keep that logic in the kernel. In other words, we have
> > the background reclaim that acts on a wmark range and those are tunable
> > from the userspace. The primary point of this background reclaim is to
> > keep balance and prevent from direct reclaim. Why cannot you implement
> > this or any other dynamic trend watching watchdog and tune watermarks
> > accordingly? Something similar applies to kcompactd although we might be
> > lacking a good interface.
> >
>
> Hi Michal,
>
> That is a very good question. As a matter of fact the initial prototype
> to assess the feasibility of this approach was written in userspace for
> a very limited application. We wrote the initial prototype to monitor
> fragmentation and used /sys/devices/system/node/node*/compact to trigger
> compaction. The prototype demonstrated this approach has merits.
>
> The primary reason to implement this logic in the kernel is to make the
> kernel self-tuning.

What makes this particular self-tuning a universal win? In other words,
there are many ways to analyze memory pressure and feed the result back
that I can think of. It is quite likely that very specific workloads would
have very specific demands there. I have seen cases where a trivial
increase of min_free_kbytes to a normally insane value worked really well
for a DB workload because, for example, the wasted memory didn't matter.

> The more knobs we have externally, the more complex
> it becomes to tune the kernel externally.

I agree on this point. Is the current set of tunables sufficient? What
would be missing if not?
-- 
Michal Hocko
SUSE Labs
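
For illustration only, the kind of userspace trend watcher discussed in
this thread could be sketched roughly as below. This is not code from the
patch series. It assumes samples are (time in seconds, number of free
pages) pairs collected by some external poller, fits an ordinary
least-squares line over a fixed-size sliding window, and estimates how
long until the fitted line crosses a threshold. The names TrendWindow,
add, fit and time_to_reach are hypothetical, and the action taken on the
result (adjusting watermarks, triggering compaction) is deliberately left
out.

```python
from collections import deque


class TrendWindow:
    """Sliding window of (time, free_pages) samples with a least-squares fit.

    Hypothetical helper for illustration; not taken from the RFC patches.
    """

    def __init__(self, size=8):
        # deque(maxlen=...) drops the oldest sample once the window is full.
        self.samples = deque(maxlen=size)

    def add(self, t, free_pages):
        self.samples.append((float(t), float(free_pages)))

    def fit(self):
        """Return (slope, intercept) of the best-fit line, or None."""
        n = len(self.samples)
        if n < 2:
            return None
        sx = sum(t for t, _ in self.samples)
        sy = sum(y for _, y in self.samples)
        sxx = sum(t * t for t, _ in self.samples)
        sxy = sum(t * y for t, y in self.samples)
        denom = n * sxx - sx * sx
        if denom == 0:
            # All samples share one timestamp; the slope is undefined.
            return None
        slope = (n * sxy - sx * sy) / denom
        intercept = (sy - slope * sx) / n
        return slope, intercept

    def time_to_reach(self, threshold):
        """Seconds after the newest sample until the line hits threshold.

        Returns None when there is no fit or free pages are not declining.
        """
        line = self.fit()
        if line is None:
            return None
        slope, intercept = line
        if slope >= 0:
            return None
        t_cross = (threshold - intercept) / slope
        t_last = self.samples[-1][0]
        return max(0.0, t_cross - t_last)
```

A watcher built on this would feed it one sample per polling interval and
act when time_to_reach() for its chosen threshold falls below the time
that reclaim or compaction is expected to need. How to pick that threshold
and that lead time is workload specific, which is the point raised above.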