Date: Thu, 13 Feb 2020 16:46:27 +0100
From: Michal Hocko
To: Johannes Weiner
Cc: Andrew Morton, Roman Gushchin, Tejun Heo, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
    kernel-team@fb.com
Subject: Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection
Message-ID: <20200213154627.GD31689@dhcp22.suse.cz>
References: <20191219200718.15696-1-hannes@cmpxchg.org>
    <20191219200718.15696-4-hannes@cmpxchg.org>
    <20200130170020.GZ24244@dhcp22.suse.cz>
    <20200203215201.GD6380@cmpxchg.org>
    <20200211164753.GQ10636@dhcp22.suse.cz>
    <20200212170826.GC180867@cmpxchg.org>
    <20200213074049.GA31689@dhcp22.suse.cz>
    <20200213132317.GA208501@cmpxchg.org>
In-Reply-To: <20200213132317.GA208501@cmpxchg.org>

On Thu 13-02-20 08:23:17, Johannes Weiner wrote:
> On Thu, Feb 13, 2020 at 08:40:49AM +0100, Michal Hocko wrote:
> > On Wed 12-02-20 12:08:26, Johannes Weiner wrote:
> > > On Tue, Feb 11, 2020 at 05:47:53PM +0100, Michal Hocko wrote:
> > > > Unless I am missing something then I am afraid it doesn't. Say you have a
> > > > default systemd cgroup deployment (aka deeper cgroup hierarchy with
> > > > slices and scopes) and now you want to grant a reclaim protection on a
> > > > leaf cgroup (or even a whole slice that is not really important). All the
> > > > hierarchy up the tree has the protection set to 0 by default, right? You
> > > > simply cannot get that protection. You would need to configure the
> > > > protection up the hierarchy and that is really cumbersome.
> > >
> > > Okay, I think I know what you mean.
> > > Let's say you have a tree like this:
> > >
> > >           A
> > >          / \
> > >        B1   B2
> > >       /  \    \
> > >     C1    C2   C3
> > >
> > > and there is no actual delegation point - everything belongs to the
> > > same user / trust domain. C1 sets memory.low to 10G, but its parents
> > > set nothing. You're saying we should honor the 10G protection during
> > > global and limit reclaims anywhere in the tree?
> >
> > No, only in C1, which sets the limit, because that is the working
> > set we want to protect.
> >
> > > Now let's consider there is a delegation point at B1: we set up and
> > > trust B1, but not its children. What effect would the C1 protection
> > > have then? Would we ignore it during global and A reclaim, but honor
> > > it when there is B1 limit reclaim?
> >
> > In the scheme with the inherited protection it would act as the gate
> > and require an explicit low limit setup defaulting to 0 if none is
> > specified.
> >
> > > Doing an explicit downward propagation from the root to C1 *could* be
> > > tedious, but I can't think of a scenario where it's completely
> > > impossible. Especially because we allow proportional distribution when
> > > the limit is overcommitted and you don't have to be 100% accurate.
> >
> > So let's see how that works in practice, say a multi workload setup
> > with complex/deep cgroup hierarchies (e.g. your above example). No
> > delegation point this time.
> >
> > C1 asks for low=1G while using 500M, C3 low=100M using 80M. B1 and
> > B2 are completely independent workloads and the same applies to C2, which
> > doesn't ask for any protection at all. C2 uses 100M. Now the admin has
> > to propagate protection upwards so B1 low=1G, B2 low=100M and A low=1G,
> > right? Let's say we have a global reclaim due to external pressure that
> > originates from outside of the A hierarchy (it is not overcommitted on the
> > protection).
> >
> > Unless I miss something C2 would get a protection even though nobody
> > asked for it.
>
> Good observation, but I think you spotted an unintentional side effect
> of how I implemented the "floating protection" calculation rather than
> a design problem.
>
> My patch still allows explicit downward propagation. So if B1 sets up
> 1G, and C1 explicitly claims those 1G (low>=1G, usage>=1G), C2 does
> NOT get any protection. There is no "floating" protection left in B1
> that could get to C2.

Yeah, the saturated protection works reasonably AFAICS.

> However, to calculate the float, I'm using the utilized protection
> counters (children_low_usage) to determine what is "claimed". Mostly
> for convenience because they were already there. In your example, C1
> is only utilizing 500M of its protection, leaving 500M in the float
> that will go toward C2. I agree that's undesirable.
>
> But it's fixable by adding a hierarchical children_low counter that
> tracks the static configuration, and using that to calculate floating
> protection instead of the dynamic children_low_usage.
>
> That way you can propagate protection from A to C1 without it spilling
> to anybody else unintentionally, regardless of how much B1 and C1 are
> actually *using*.
>
> Does that sound reasonable?

Please post a patch and I will think about it more to see whether I can
see more problems. I am worried this is getting more and more complex
and harder to wrap my head around.

Thanks!
--
Michal Hocko
SUSE Labs
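
For readers following the arithmetic, here is a toy model of the
floating-protection calculation under discussion. It is only a sketch
and not the kernel code: the function name, the dictionary layout, the
rule that the float is shared in proportion to unprotected usage, and
the 1G = 1000M rounding are all assumptions made for illustration.
Proportional scaling for overcommitted protection is left out. The
static_claims flag switches between claims based on utilized protection
(what children_low_usage tracks) and claims based on the configured
memory.low values (the proposed children_low counter).

```python
G = 1000  # MB; rounded so the numbers match the thread's arithmetic


def effective_protection(parent_effective, children, static_claims):
    """Toy model: per-child protection (MB) under one parent.

    children maps a hypothetical cgroup name to (low, usage) in MB.
    static_claims=False: claimed = sum of min(low, usage) (usage-based).
    static_claims=True:  claimed = sum of configured low (config-based).
    """
    # Protection a child actually utilizes: never more than its usage.
    protected = {n: min(low, usage) for n, (low, usage) in children.items()}

    if static_claims:
        claimed = sum(low for low, _ in children.values())
    else:
        claimed = sum(protected.values())

    # Whatever the parent has beyond the claims "floats" to the children.
    floating = max(parent_effective - claimed, 0)

    # Assumption: the float is shared in proportion to unprotected usage.
    unprotected = {n: usage - protected[n]
                   for n, (_, usage) in children.items()}
    total_unprotected = sum(unprotected.values())

    result = {}
    for name in children:
        share = 0
        if floating and total_unprotected:
            share = floating * unprotected[name] // total_unprotected
        # A child cannot be protected beyond what it actually uses.
        result[name] = protected[name] + min(share, unprotected[name])
    return result


# The B1 subtree from the example: C1 low=1G using 500M, C2 low=0 using 100M.
b1_children = {"C1": (1 * G, 500), "C2": (0, 100)}
usage_based = effective_protection(1 * G, b1_children, static_claims=False)
config_based = effective_protection(1 * G, b1_children, static_claims=True)
```

In this model the usage-based variant leaves a 500M float in B1 and C2
ends up with 100M of protection it never asked for, while the
config-based variant leaves no float and C2 gets none. C1 is protected
for its 500M of usage in both cases.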