From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [PATCH v4] mm/memcg: try harder to decrease [memory,memsw].limit_in_bytes
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Shakeel Butt
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
Message-ID: <560a77b5-02d7-cbae-35f3-0b20a1c384c2@virtuozzo.com>
Date: Fri, 12 Jan 2018 00:59:38 +0300
References: <20180109152622.31ca558acb0cc25a1b14f38c@linux-foundation.org>
 <20180110124317.28887-1-aryabinin@virtuozzo.com>
 <20180111104239.GZ1732@dhcp22.suse.cz>
 <4a8f667d-c2ae-e3df-00fd-edc01afe19e1@virtuozzo.com>
 <20180111124629.GA1732@dhcp22.suse.cz>
 <20180111162947.GG1732@dhcp22.suse.cz>
In-Reply-To: <20180111162947.GG1732@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/11/2018 07:29 PM, Michal Hocko wrote:
> On Thu 11-01-18 18:23:57, Andrey Ryabinin wrote:
>> On 01/11/2018 03:46 PM, Michal Hocko wrote:
>>> On Thu 11-01-18 15:21:33, Andrey Ryabinin wrote:
>>>>
>>>> On 01/11/2018 01:42 PM, Michal Hocko wrote:
>>>>> On Wed 10-01-18 15:43:17, Andrey Ryabinin
wrote:
>>>>> [...]
>>>>>> @@ -2506,15 +2480,13 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
>>>>>>  		if (!ret)
>>>>>>  			break;
>>>>>>
>>>>>> -		try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL, !memsw);
>>>>>> -
>>>>>> -		curusage = page_counter_read(counter);
>>>>>> -		/* Usage is reduced ? */
>>>>>> -		if (curusage >= oldusage)
>>>>>> -			retry_count--;
>>>>>> -		else
>>>>>> -			oldusage = curusage;
>>>>>> -	} while (retry_count);
>>>>>> +		usage = page_counter_read(counter);
>>>>>> +		if (!try_to_free_mem_cgroup_pages(memcg, usage - limit,
>>>>>> +						  GFP_KERNEL, !memsw)) {
>>>>>
>>>>> If the usage drops below limit in the meantime then you get underflow
>>>>> and reclaim the whole memcg. I do not think this is a good idea. This
>>>>> can also lead to over reclaim. Why don't you simply stick with the
>>>>> original SWAP_CLUSTER_MAX (aka 1 for try_to_free_mem_cgroup_pages)?
>>>>>
>>>>
>>>> Because, if the new limit is gigabytes below the current usage, retrying to
>>>> set the new limit after reclaiming only 32 pages seems unreasonable.
>>>
>>> Who would do insanity like that?
>>>
>>
>> What's insane about that?
>
> I haven't seen this being done in practice. Why would you want to
> reclaim GBs of memory from a cgroup? Anyway, if you believe this is
> really needed then simply do it in a separate patch.
>

For the same reason as anyone would want to set a memory limit on some job
that generates too much pressure and disrupts others. Whether it's GBs or
MBs is just a matter of scale.

A more concrete example is a workload that generates lots of page cache.
Without a limit (or with one set too high) it wakes up kswapd, which starts
thrashing all the other cgroups. That's pretty bad for mostly-anon cgroups,
as we may constantly swap hot data back and forth.
>>>> @@ -2487,8 +2487,8 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
>>>>  		if (!ret)
>>>>  			break;
>>>>
>>>> -		usage = page_counter_read(counter);
>>>> -		if (!try_to_free_mem_cgroup_pages(memcg, usage - limit,
>>>> +		nr_pages = max_t(long, 1, page_counter_read(counter) - limit);
>>>> +		if (!try_to_free_mem_cgroup_pages(memcg, nr_pages,
>>>>  						  GFP_KERNEL, !memsw)) {
>>>>  			ret = -EBUSY;
>>>>  			break;
>>>
>>> How does this address the over reclaim concern?
>>
>> It protects from over reclaim due to underflow.
>
> I do not think so. Consider that this reclaim races with other
> reclaimers. Now you are reclaiming a large chunk so you might end up
> reclaiming more than necessary. SWAP_CLUSTER_MAX would reduce the over
> reclaim to be negligible.
>

I did consider this, and I think I already explained that sort of race in a
previous email. Whether "Task B" is really a task in the cgroup, or is
actually a bunch of reclaimers, doesn't matter; that doesn't change anything.