Subject: Re: [PATCH v2] mm/vmscan: get number of pages on the LRU list in memcgroup base on lru_zone_size
To: Michal Hocko
Cc: linux-mm@kvack.org, vdavydov.dev@gmail.com, hannes@cmpxchg.org
From: Honglei Wang
Date: Thu, 10 Oct 2019 16:40:52 +0800
Message-ID: <4dccae1b-2b34-7ff9-94c3-8814baab636e@oracle.com>
In-Reply-To: <20191009141644.GD6681@dhcp22.suse.cz>
References: <20190905071034.16822-1-honglei.wang@oracle.com> <20191007142805.GM2381@dhcp22.suse.cz> <991b4719-a2a0-9efe-de02-56a928752fe3@oracle.com> <20191009141644.GD6681@dhcp22.suse.cz>

On 10/9/19 10:16 PM, Michal Hocko wrote:
> On Tue 08-10-19 17:34:03, Honglei Wang wrote:
>> How about we describe it like this:
>>
>> Getting the lru_size based on lru_zone_size of mem_cgroup_per_node,
>> which is not updated via batching, helps any related code path get a
>> more precise lru size in the mem_cgroup case.
>> This keeps the memory reclaim code from ignoring small blocks of
>> memory (say, less than MEMCG_CHARGE_BATCH pages) on the lru list.
>
> I am sorry but this doesn't really explain the problem nor justify the
> patch.
> Let's have a look at where we are at first. lruvec_lru_size provides an
> estimate of the number of pages on the given lru that qualifies for the
> given zone index. Note the estimate part because that is an optimization
> for the updaters' path, which tends to be really hot. Here we are
> consistent between the global and memcg cases.
>
> Now we can have a look at differences between the two cases. The global
> LRU case relies on periodic syncing from a kworker context. This has no
> guarantee on the timing and as such we cannot really rely on it to be
> precise. Memcg path batches updates to MEMCG_CHARGE_BATCH (32) pages
> and propagates the value up the hierarchy. There is no periodic sync up
> so the unsynced case might stay forever if there are no new accounting
> events happening.
>
> Now, does it really matter? 32 pages should be really negligible to
> normal workloads (read to those where MEMCG_CHARGE_BATCH << limits).
> So we can talk whether other usecases are really sensible. Do we really
> want to support memcgs with hard limit set to 10 pages? I would say I am
> not really convinced because I have a hard time seeing a real application
> other than some artificial testing. On the other hand there is really
> non trivial effort to make such usecases work - just consider all
> potential caching/batching that we do for performance reasons.
>

Thanks for the detailed explanation, Michal. Yes, I didn't care about
this kind of testing until the QA folks came to me and said that an LTP
test case doesn't work as expected, while the same test passed on an
older kernel. I recognize there are some users whose job is functional
verification on Linux. It might confuse them that the same test case
fails on the latest kernel.
And they don't know kernel internals such as the details of batch
accounting. They just want to use a few pages of memory to verify the
memory feature, and no 32-page limitation is mentioned in any
documentation... I explained batch accounting and MEMCG_CHARGE_BATCH to
the QA colleague and clarified that it's not a kernel bug. But on the
other hand, the question is: is it necessary for us to cater to these
users?

> That being said, making lruvec_lru_size more precise doesn't sound like
> a bad idea in general. But it comes with an additional cost which
> shouldn't really matter much with the current code because it shouldn't
> be used from hot paths. But is this really the case? Have you done all
> the audit? Is this going to stay that way? These are important questions
> to answer in the changelog to justify the change properly.
>
> I hope this makes more sense now.
>

Yes, I'll think more about these questions.

Thanks,
Honglei