From: Wei Xu
Date: Sat, 10 Dec 2022 00:01:28 -0800
Subject: Re: [PATCH v3] [mm-unstable] mm: Fix memcg reclaim on memory tiered systems
To: Michal Hocko
Cc: Mina Almasry, Andrew Morton, Johannes Weiner, Roman Gushchin,
 Shakeel Butt, Muchun Song, Huang Ying, Yang Shi, Yosry Ahmed,
 fvdl@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org

On Fri, Dec 9, 2022 at 1:16 PM Michal Hocko wrote:
>
> On Fri 09-12-22 08:41:47, Wei Xu wrote:
> > On Fri, Dec 9, 2022 at 12:08 AM Michal Hocko wrote:
> > >
> > > On Thu 08-12-22 16:59:36, Wei Xu wrote:
> > > [...]
> > > > > What I really mean is to add demotion nodes to the nodemask along with
> > > > > the set of nodes you want to reclaim from. To me that sounds like a
> > > > > more natural interface allowing for all sorts of usecases:
> > > > > - free up demotion targets (only specify demotion nodes in the mask)
> > > > > - control where to demote (e.g. select specific demotion target(s))
> > > > > - do not demote at all (skip demotion nodes from the node mask)
> > > >
> > > > For clarification, do you mean to add another argument (e.g.
> > > > demotion_nodes) in addition to the "nodes" argument?
> > >
> > > No, nodes=mask argument should control the domain where the memory
> > > reclaim should happen. That includes both aging and the reclaim. If the
> > > mask doesn't contain any lower tier node then no demotion will happen.
> > > If only a subset of lower tiers are specified then only those could be
> > > used for the demotion process. Or put it otherwise, the nodemask is not
> > > only used to filter out zonelists during reclaim it also restricts
> > > migration targets.
> > >
> > > Is this more clear now?
> >
> > In that case, how can we request demotion only from toptier nodes
> > (without counting any reclaimed bytes from other nodes), which is our
> > memory tiering use case?
>
> I am not sure I follow. Could you be more specific please?

In our memory tiering use case, we would like to proactively free up
memory on top-tier nodes by demoting cold pages to lower-tier nodes.
This is to create enough free top-tier memory for new allocations and
promotions. How many pages to demote from top-tier nodes, and how
often, depends on a number of factors (e.g. the amount of free
top-tier memory, the amount of cold pages, the bandwidth pressure on
the lower tier, and each task's tolerance for the performance impact
of slower memory) and is controlled by userspace policies.

Because the purpose of such proactive demotions is to free up top-tier
memory, not to lower the amount of memory charged to the memcg, we
would like memory.reclaim to demote the specified number of bytes from
the given top-tier nodes. If we also have to provide the lower-tier
nodes to memory.reclaim to allow demotions, the kernel can reclaim
from the lower-tier nodes in the same memory.reclaim request. We then
won't be able to control the number of bytes demoted from top-tier
nodes.
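
To make the concern concrete, here is a toy model in Python. It is
not kernel code: the function toy_reclaim(), the node ids, and the
"order" argument (standing in for whichever node the kernel happens to
scan first) are all made up for illustration. It assumes the reading
of the nodes= proposal discussed above, where demotion is possible only
if the mask contains a lower-tier node, and it assumes for simplicity
that every byte moved counts toward the request.

```python
def toy_reclaim(request, mask, cold, top_tier, order):
    """Toy model of one memory.reclaim request (illustration only).

    request  : bytes asked for
    mask     : set of node ids passed as nodes=
    cold     : dict mapping node id -> cold bytes available on that node
    top_tier : set of top-tier node ids
    order    : node visiting order, a stand-in for the kernel's choice
    """
    # Assumption: demotion needs at least one lower-tier node in the mask.
    can_demote = any(node not in top_tier for node in mask)
    result = {
        "demoted_from_top": 0,
        "evicted_from_top": 0,
        "reclaimed_from_lower": 0,
    }
    remaining = request
    for node in order:
        if node not in mask or remaining == 0:
            continue
        take = min(cold[node], remaining)
        if node not in top_tier:
            result["reclaimed_from_lower"] += take
        elif can_demote:
            result["demoted_from_top"] += take
        else:
            result["evicted_from_top"] += take
        remaining -= take
    return result


cold = {0: 300, 1: 300}  # node 0 is top-tier, node 1 is lower-tier
top = {0}
print(toy_reclaim(100, {0, 1}, cold, top, order=[0, 1]))
print(toy_reclaim(100, {0, 1}, cold, top, order=[1, 0]))
print(toy_reclaim(100, {0}, cold, top, order=[0, 1]))
```

With both nodes in the mask, the same 100-byte request demotes 100
bytes from node 0 in the first call and none in the second, depending
only on the visiting order, which userspace does not control. With
only node 0 in the mask, nothing is demoted at all in this model.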

> > Besides, when both toptier and demotion nodes are specified, the
> > demoted pages should only be counted as aging and not be counted
> > towards the requested bytes of try_to_free_mem_cgroup_pages(), which
> > is what this patch tries to address.
>
> This should be addressed by
> http://lkml.kernel.org/r/Y5B1K5zAE0PkjFZx@dhcp22.suse.cz, no?
> --
> Michal Hocko
> SUSE Labs