From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrey Ryabinin <aryabinin@virtuozzo.com>
To: Michal Hocko
Cc: Andrew Morton, Johannes Weiner, Vladimir Davydov, cgroups@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Shakeel Butt
Subject: Re: [PATCH v3 1/2] mm/memcg: try harder to decrease [memory,memsw].limit_in_bytes
Date: Tue, 9 Jan 2018 20:08:34 +0300
Message-ID: <99a37cfd-c134-6439-cb6e-81382bc03833@virtuozzo.com>
In-Reply-To: <20180109165815.8329-1-aryabinin@virtuozzo.com>
References: <20171220135329.GS4831@dhcp22.suse.cz> <20180109165815.8329-1-aryabinin@virtuozzo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/09/2018 07:58 PM, Andrey Ryabinin wrote:
> mem_cgroup_resize_[memsw]_limit() tries to free only 32 (SWAP_CLUSTER_MAX)
> pages on each iteration. This makes it practically impossible to decrease
> the limit of a memory cgroup. Tasks could easily allocate back 32 pages,
> so we can't reduce memory usage, and once retry_count reaches zero we
> return -EBUSY.
>
> The problem is easy to reproduce by running the following commands:
>
>   mkdir /sys/fs/cgroup/memory/test
>   echo $$ >> /sys/fs/cgroup/memory/test/tasks
>   cat big_file > /dev/null &
>   sleep 1 && echo $((100*1024*1024)) > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
>   -bash: echo: write error: Device or resource busy
>
> Instead of relying on retry_count, keep retrying the reclaim until the
> desired limit is reached, or fail if the reclaim doesn't make any progress
> or a signal is pending.
>
> Signed-off-by: Andrey Ryabinin
> ---
>
> Changes since v2:
>  - Changelog wording per mhocko@
>

Ugh, sorry, I forgot to +Cc Michal this time.
The changelog is the only thing that changed between v2 and v3.

>  mm/memcontrol.c | 70 +++++++++++++--------------------------------------------
>  1 file changed, 16 insertions(+), 54 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index f40b5ad3f959..0d26db9a665d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1176,20 +1176,6 @@ void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
>  }
>  
>  /*
> - * This function returns the number of memcg under hierarchy tree. Returns
> - * 1(self count) if no children.
> - */
> -static int mem_cgroup_count_children(struct mem_cgroup *memcg)
> -{
> -        int num = 0;
> -        struct mem_cgroup *iter;
> -
> -        for_each_mem_cgroup_tree(iter, memcg)
> -                num++;
> -        return num;
> -}
> -
> -/*
>   * Return the memory (and swap, if configured) limit for a memcg.
>   */
>  unsigned long mem_cgroup_get_limit(struct mem_cgroup *memcg)
> @@ -2462,22 +2448,10 @@ static DEFINE_MUTEX(memcg_limit_mutex);
>  static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
>                                     unsigned long limit)
>  {
> -        unsigned long curusage;
> -        unsigned long oldusage;
> +        unsigned long usage;
>          bool enlarge = false;
> -        int retry_count;
>          int ret;
>  
> -        /*
> -         * For keeping hierarchical_reclaim simple, how long we should retry
> -         * is depends on callers. We set our retry-count to be function
> -         * of # of children which we should visit in this loop.
> -         */
> -        retry_count = MEM_CGROUP_RECLAIM_RETRIES *
> -                      mem_cgroup_count_children(memcg);
> -
> -        oldusage = page_counter_read(&memcg->memory);
> -
>          do {
>                  if (signal_pending(current)) {
>                          ret = -EINTR;
> @@ -2498,15 +2472,13 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
>                  if (!ret)
>                          break;
>  
> -                try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL, true);
> -
> -                curusage = page_counter_read(&memcg->memory);
> -                /* Usage is reduced ? */
> -                if (curusage >= oldusage)
> -                        retry_count--;
> -                else
> -                        oldusage = curusage;
> -        } while (retry_count);
> +                usage = page_counter_read(&memcg->memory);
> +                if (!try_to_free_mem_cgroup_pages(memcg, usage - limit,
> +                                                  GFP_KERNEL, true)) {
> +                        ret = -EBUSY;
> +                        break;
> +                }
> +        } while (true);
>  
>          if (!ret && enlarge)
>                  memcg_oom_recover(memcg);
> @@ -2517,18 +2489,10 @@ static int mem_cgroup_resize_limit(struct mem_cgroup *memcg,
>  static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
>                                           unsigned long limit)
>  {
> -        unsigned long curusage;
> -        unsigned long oldusage;
> +        unsigned long usage;
>          bool enlarge = false;
> -        int retry_count;
>          int ret;
>  
> -        /* see mem_cgroup_resize_res_limit */
> -        retry_count = MEM_CGROUP_RECLAIM_RETRIES *
> -                      mem_cgroup_count_children(memcg);
> -
> -        oldusage = page_counter_read(&memcg->memsw);
> -
>          do {
>                  if (signal_pending(current)) {
>                          ret = -EINTR;
> @@ -2549,15 +2513,13 @@ static int mem_cgroup_resize_memsw_limit(struct mem_cgroup *memcg,
>                  if (!ret)
>                          break;
>  
> -                try_to_free_mem_cgroup_pages(memcg, 1, GFP_KERNEL, false);
> -
> -                curusage = page_counter_read(&memcg->memsw);
> -                /* Usage is reduced ? */
> -                if (curusage >= oldusage)
> -                        retry_count--;
> -                else
> -                        oldusage = curusage;
> -        } while (retry_count);
> +                usage = page_counter_read(&memcg->memsw);
> +                if (!try_to_free_mem_cgroup_pages(memcg, usage - limit,
> +                                                  GFP_KERNEL, false)) {
> +                        ret = -EBUSY;
> +                        break;
> +                }
> +        } while (true);
>  
>          if (!ret && enlarge)
>                  memcg_oom_recover(memcg);
> 
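
For anyone skimming the diff, here is a rough back-of-the-envelope comparison
of the two strategies. This is a stand-alone user-space sketch, not kernel
code: the 1 GiB usage and 100 MiB target are made-up example numbers, and
MEM_CGROUP_RECLAIM_RETRIES is assumed to be 5 (its value in mm/memcontrol.c
around this time).

/*
 * Stand-alone illustration only -- not kernel code.  The usage/limit values
 * and MEM_CGROUP_RECLAIM_RETRIES = 5 are assumptions for the example.
 */
#include <stdio.h>

#define PAGE_SIZE                  4096UL
#define SWAP_CLUSTER_MAX           32UL  /* pages freed per pass by the old loop */
#define MEM_CGROUP_RECLAIM_RETRIES 5UL   /* assumed value at the time */

int main(void)
{
        unsigned long usage = 1024UL << 20;  /* 1 GiB currently charged (example) */
        unsigned long limit = 100UL << 20;   /* desired limit_in_bytes (example) */
        unsigned long excess = (usage - limit) / PAGE_SIZE;
        unsigned long children = 1;          /* mem_cgroup_count_children() self count */

        printf("excess to reclaim: %lu pages\n", excess);

        /*
         * Old loop: each pass reclaims only about SWAP_CLUSTER_MAX pages and
         * the loop gives up after RETRIES * children passes with no visible
         * progress.  A task charging back >= 32 pages per pass erases every
         * pass's work, so the budget is exhausted and -EBUSY is returned.
         */
        printf("old: %lu pages per pass, at most %lu no-progress passes\n",
               SWAP_CLUSTER_MAX, MEM_CGROUP_RECLAIM_RETRIES * children);

        /*
         * New loop: a single try_to_free_mem_cgroup_pages() call is asked for
         * the whole excess, and the loop only stops early if reclaim makes no
         * progress at all or a signal is pending.
         */
        printf("new: %lu pages requested in one reclaim call\n", excess);
        return 0;
}

With these example numbers the excess is roughly 236k pages, so the old
scheme's handful of 32-page passes never gets anywhere near the target before
it bails out with -EBUSY, while the patched loop hands the whole excess to
reclaim in one call and only fails once reclaim itself stops making progress
or the write is interrupted.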