From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DB0AC433FE for ; Fri, 25 Mar 2022 10:31:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1358878AbiCYKdH (ORCPT ); Fri, 25 Mar 2022 06:33:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1358907AbiCYKcz (ORCPT ); Fri, 25 Mar 2022 06:32:55 -0400 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 79A22CA0D5; Fri, 25 Mar 2022 03:31:21 -0700 (PDT) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 315BB210DD; Fri, 25 Mar 2022 10:31:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1648204280; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=h6jomU3pNhNfJYGm9LatF9aqOCiUlgqkCJJ0ZhBuK44=; b=OsMPUua2iforFSm/UO/9WA9UJuZ+xXzZyx7U7SbDGrfPWJ19fd6uUDvHS5HlFiizH8bmb/ p2dURGMdjGTgK315FM0aQYhh88yWszyYJUdLrvzWW2gJttjV5HsA9O+13YgiZIiNGIXf9t fn7HnjdhscuOqORQ1tnVhDPzk3vTItA= Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id B7E901332D; Fri, 25 Mar 2022 10:31:19 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id /5V/K/eZPWLFHgAAMHmgww (envelope-from ); Fri, 25 Mar 2022 10:31:19 +0000 Date: Fri, 25 Mar 2022 11:31:18 +0100 From: Michal =?iso-8859-1?Q?Koutn=FD?= To: Roman Gushchin Cc: cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Richard Palethorpe , Andrew Morton , Shakeel Butt , Michal Hocko , Vlastimil Babka , "Matthew Wilcox (Oracle)" , Muchun Song , Johannes Weiner , Yang Shi , Suren Baghdasaryan , Tejun Heo , Chris Down Subject: Re: [RFC PATCH] mm: memcg: Do not count memory.low reclaim if it does not happen Message-ID: <20220325103118.GC2828@blackbody.suse.cz> References: <20220324095157.GA16685@blackbody.suse.cz> <5049EBC3-5BAE-4509-BA63-1F4A7D913517@linux.dev> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <5049EBC3-5BAE-4509-BA63-1F4A7D913517@linux.dev> User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 24, 2022 at 11:17:14AM -0700, Roman Gushchin wrote: > Ok, so it’s not really about the implementation details of the reclaim > mechanism (I mean rounding up to the batch size etc), Actually, that was what I deemed more serious first. It's the point 2 of RFCness: | 2) The observed behavior slightly impacts distribution of parent's memory.low. | Constructed example is a passive protected workload in s1 and active in s2 | (active ~ counteracts the reclaim with allocations). It could strip | protection from s1 one by one (one:=SWAP_CLUSTER_MAX/2^sc.priority). | That may be considered both wrong (s1 should have been more protected) or | correct s2 deserves protection due to its activity. | I don't have (didn't collect) data for this, so I think just masking the | false events is sufficient (or independent). > Idk, I don’t have a strong argument against this change (except that > it changes the existing behavior), but I also don’t see why such > events are harmful. Do you mind elaborating a bit more? So I've collected some demo data now. systemd-run \ -u precious.service --slice=test-protected.slice \ -p MemoryLow=50M \ /root/memeater 50 # allocates 50M anon, doesn't use it systemd-run \ -u victim.service --slice=test-protected.slice \ -p MemoryLow=0M \ /root/memeater -m 50 50 # allocates 50M anon, uses it echo "Started workloads" systemctl set-property --runtime test.slice MemoryMax=200M systemctl set-property --runtime test-protected.slice MemoryLow=50M sleep 5 systemd-run \ -u pressure.service --slice=test.slice \ -p MemorySwapMax=0M \ # to push test-protected.slice to swap /root/memeater -m 170 170 sleep 5 systemd-cgtop -b -1 -m test.slice Result with memory_recursiveprot > Control Group Tasks %CPU Memory Input/s Output/s > test.slice 3 - 199.9M - - > test.slice/pressure.service 1 - 170.5M - - > test.slice/test-protected.slice 2 - 29.4M - - > test.slice/test-protected.slice/victim.service 1 - 29.1M - - > test.slice/test-protected.slice/precious.service 1 - 292.0K - - Result without memory_recursiveprot > Control Group Tasks %CPU Memory Input/s Output/s > test.slice 3 - 199.8M - - > test.slice/pressure.service 1 - 170.5M - - > test.slice/test-protected.slice 2 - 29.3M - - > test.slice/test-protected.slice/precious.service 1 - 28.7M - - > test.slice/test-protected.slice/victim.service 1 - 560.0K - - (kernel 5.17.0, systemd 249.10) So with this result, I'd say the event reporting is an independent change (admiteddly, thanks to the current implementation (not the proposal of mine) I noticed this issue). /me scratches head, let me review my other approaches... Michal