From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757038AbcFHOH3 (ORCPT ); Wed, 8 Jun 2016 10:07:29 -0400 Received: from mail-db3on0142.outbound.protection.outlook.com ([157.55.234.142]:54701 "EHLO emea01-db3-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751736AbcFHOH1 (ORCPT ); Wed, 8 Jun 2016 10:07:27 -0400 Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=VDavydov@virtuozzo.com; Date: Wed, 8 Jun 2016 16:52:04 +0300 From: Vladimir Davydov To: Michal Hocko CC: Andrew Morton , Tetsuo Handa , David Rientjes , Johannes Weiner , , Subject: Re: [PATCH 2/2] mm: oom: deduplicate victim selection code for memcg and global oom Message-ID: <20160608135204.GA30465@esperanza> References: <40e03fd7aaf1f55c75d787128d6d17c5a71226c2.1464358556.git.vdavydov@virtuozzo.com> <3bbc7b70dae6ace0b8751e0140e878acfdfffd74.1464358556.git.vdavydov@virtuozzo.com> <20160608083334.GF22570@dhcp22.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline In-Reply-To: <20160608083334.GF22570@dhcp22.suse.cz> X-Originating-IP: [195.214.232.10] X-ClientProxiedBy: AM3PR08CA0011.eurprd08.prod.outlook.com (2a01:111:e400:8840::21) To AM3PR08MB0579.eurprd08.prod.outlook.com (2a01:111:e400:c408::13) X-MS-Office365-Filtering-Correlation-Id: 1c000274-dd40-4cde-fd1f-08d38fa415be X-Microsoft-Exchange-Diagnostics: 1;AM3PR08MB0579;2:YwiTuSfILmy30z+cKG9PA+T/4H4td625fhAwPd5+mclrJH7D6EFPGj10W1XcZGSh1YENvQcVj1i5F35hm36D+y+DAupojLkOhombzFHFJN676NVP9XBKVQUwpnj5airqmN9nV5kMkLV/ozMykzSpD8N/3RSyWGhu9sX1gOd11FYdev3t50sJWCqchKlfMwEp;3:mEozJxxUaJ1f4KVL+22MABku6ADqMlOv1L830PZBizyQSVvkSOvEqHv1G5uRVzPCyoY7hSRTxNukkDtGHCdonDTnm+y14iXa3okwIapBDSwNNt0ylu9BDH6H+r390kF7 X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:AM3PR08MB0579; X-Microsoft-Exchange-Diagnostics: 1;AM3PR08MB0579;25:GBlTGFWBaK8aUTn6DOR1hrWhEfLx5t3dTT8c5y/J1bQ2qk+9G6aoe/sMYdScwqk1LhtmPfMQOo1rnRa1JQhZBjxrBE5fSpTN+IQBBR90KstaKVRcW0JEXwcL4epl4pFqlqqKBT2oUcn8ZCn1gQD+J/EFvFnllEoL85Mb4o9Sqz/vA84gjxy2tCXxxQwCPYeLS2+VXb3G0Mc2mCzd8BHHbwcF5BbzTn4vjqCiyXvpMEgsspp/9jYH8ZW0BXUB8E8JgDzzDnKJrO7mrgKUNcB8Am9iha+lKDihpQOYUD6emPl+JJ4L1xa3imtnp2csGRxPjnMLnwMtwN3P17+9HhduIWW+hNH01m9OVYARzxk7VcLmgyu9By1oWxFRs+koRm92MPfoGDcO5TWxiegJ3M9PxGYcNr0eLyvKJib73WdfX1VogEo3IW9aL3lime4ci2Ybain3euKpOtwegTzwC5MbKZSK9SWhBrXhwiz6lPo9WpbUdaZBIT/YqhIo3dPkKYaA3VlXEYn28DOJMV4Pfbvr9LG1APQ6E6vMXnYIdOCi/Ffhe7itj2ci+imErwl/nyjNjR2HpprIK4opGXpIzyRAKIi57BOiTLNcbtE03dC2slyU3ILq7ghSqPhbDmNzsXfzTI3isx5I7KhAyzlvLuwXS1x0lqSaUhSNUyejKWLim6Tiu6grgXc+l6Q7NBXpf7tBUhM9/hVEIVBwM+8OFJJg8Xrae86jU+ZnAFpvgpM5k/M= X-Microsoft-Antispam-PRVS: X-Exchange-Antispam-Report-Test: UriScan:; X-Exchange-Antispam-Report-CFA-Test: BCL:0;PCL:0;RULEID:(6040130)(601004)(2401047)(5005006)(8121501046)(3002001)(10201501046)(6041072)(6043046);SRVR:AM3PR08MB0579;BCL:0;PCL:0;RULEID:;SRVR:AM3PR08MB0579; X-Microsoft-Exchange-Diagnostics: 1;AM3PR08MB0579;4:9zBY2Jb5P257w0ZwMukdPaUQKPN2Ubz1ELUJy+abVUbFdT/zie+Bq1gXeEZ0S2Z5CkuOjUurUS8xi/WIyTv3Y8JGSwHu0nq4QswQkgYkOGBdrQ8p/UyGNqMBw5VHhKyaXTxVGxpkkmE+Sju5MryNg/aLpygnXEVnnUn11eD/PrdPUMcJuFsUSydRi/UlI7ePKA7VKDj3nB+Yq/G/95aRh5qEFjvlC2c2Mf3ZD4w1cW7dh5OEBFZ48+I0/OpHwizVyPueauLfdxQgkT8d6XKSPFwPJyKXk6VgDIjwxzwqwYafzn29wzo+0HIT7nN1GiSzoF1MpQH65KOcAT4CLwWJT52vXOK/mCTpJRpOLnoxaBcxk0fSyW70UGTd3kpkAx9xqY1IDxoPyKEMOMsQxBuRxuQ9N132Ll7F2c+rRSOf6Ms= X-Forefront-PRVS: 0967749BC1 X-Forefront-Antispam-Report: SFV:NSPM;SFS:(10019020)(4630300001)(6009001)(199003)(189002)(24454002)(377424004)(68736007)(33656002)(5008740100001)(80792005)(5004730100002)(8676002)(101416001)(9686002)(76176999)(2906002)(50986999)(54356999)(4326007)(23726003)(33716001)(1076002)(6116002)(586003)(66066001)(97756001)(47776003)(46406003)(92566002)(97736004)(105586002)(81156014)(86362001)(42186005)(106356001)(3846002)(110136002)(189998001)(50466002)(81166006)(2950100001)(309714004);DIR:OUT;SFP:1102;SCL:1;SRVR:AM3PR08MB0579;H:esperanza;FPR:;SPF:None;PTR:InfoNoRecords;A:1;MX:1;LANG:en; X-Microsoft-Exchange-Diagnostics: =?us-ascii?Q?1;AM3PR08MB0579;23:xkxQDeMuG6GKTp8sZzQqsWHe2yOSomSbpC0m7nC6g?= =?us-ascii?Q?qNOubLCNZYbFOFCLKTL6rsMNI4SJmhWH4OEj8xazpDObGM/Vj3g8OJugGOfC?= =?us-ascii?Q?7IVmn6fnBlpIsDEAcqZbvJw9izSzkZMoU4DZoGaSTzYaPaMMNv6M3LfrwnNq?= =?us-ascii?Q?nouQjZF8Hfb2VPAJFz1cBAiGcs+aVehMCivwP5lqk+pTe8sxRusIqtCTdHIa?= =?us-ascii?Q?dBrcI95nYEAXbf9/G/0S3/7D3+/KfzWhI+ZeZqWV38QF84D0hkzL62JledTQ?= =?us-ascii?Q?nGckv4kws3CHx+3F5qD4HtVf4J2JittsHYgfk2azWNJ7shoHaieVp4jZZLQn?= =?us-ascii?Q?6QSq2E8QQB8jXQpIgoyqJ3YgBbZcrIKSNuzcS7gGmwCAgJ7Xe957qP1vf1eH?= =?us-ascii?Q?CPT07mFY0tErKrLd4xypn1nZooltZlOxoo/HUDj58vEvUJZBZsNKa2C0KqFB?= =?us-ascii?Q?dkxseP1kjTd0BW6FiWdRo8kXUIwHdalEkxYl19Gr7fojubM7cyL9bTID+hMu?= =?us-ascii?Q?IRmtXVVqO++eqLJ8cRKaeHur+WC1vA4iDHdVp0kvrb/s8V0YL6KUW1RUrevV?= =?us-ascii?Q?PVvU3Qa4pWT8mqZCdzNZzNzDSsJ4ZH5XHAH9Ncw3DITa1sPzQgNW/QZcz0+Q?= =?us-ascii?Q?Yp6XyWkOCgWrdZm1MWfx2Vmoa++tFWErQfChYJpWrI5CA35cUIyxxTqapPxL?= =?us-ascii?Q?pzxKHgW9m3Fac1xMvMt6JEIi+64r8MbtvoWawebP6+nqC7ITF78YAL8ZJVb2?= =?us-ascii?Q?+XOKtFsl6ZtW6RWqPRyx9XmuJ+D1J4c8UBG9jHnViIjIUkDZMtsH0uhP/mc3?= =?us-ascii?Q?u+Kb8diKa8n4glVonL2E9HMv8Ma1QuYZib4ztlXH5U54tY51XOqE5RhNBCiV?= =?us-ascii?Q?D4KT2laQIhnw+jBtXvvoG5iu8ECf1e1DxHHQEDO+kwJb1d2KwQ8air7ejRlr?= =?us-ascii?Q?BrfRBY7iVLUSx11WRu48fReCu8DgMYhiHyHyH0b4I+MUQnFmJ2CBRn3LfKT1?= =?us-ascii?Q?X1RTkvC6aW3dMhmqp+9zg2SPsDeeX4pa5jMLzSE3s7yYfaen0NrUCIj3KxxT?= =?us-ascii?Q?a+WVlqh8iyOuhmvoertD9VY+Dtp?= X-Microsoft-Exchange-Diagnostics: 1;AM3PR08MB0579;5:jqpHF2nKN7YKo7c0tBOvhbYm2iySGqE69DcWlb+AscL5LtfhkSZkT3H/iBwiQaiamOlCFNNvFlBHKGle5dkQAk6bBYKwbzlJ1U9210vhyCK4lAfUanJ61dln/KWWhtda0+nHSQ+z7D2nB/Z4btfSvA==;24:VBFF5HesflMq1XOhTOH4YFcNFOkzLYPl7cLLp4HwI8bzVAgn1TV79x7dokWIgK4cOUJrEVpTdSF93UwoENP5YcZnCP+BT4l+soU2TI7FnzQ=;7:sjQdoq8M2q9zlTimsKXJtnhshLehjzfodjJbQlSAJqMljZNLKNnjvRElZJSrUQ7KLBFVkJWMpXiTIaH8xXDYsin+kH4CVEUr5fYhNn5zfNNDpk1k14bBToP+iTd+47wfPin7xaZumjXrjE5ijwTNca0A5hzaHRZPHOBQGypefnwBSDlf9htqfv/DsqMfg2duHDfxtVacO4/dx7EmiXqYdEyvTKjFkV/fkD0D4F7KvP0=;20:RPUZ9c85GITIUorHmEPlABuJNuPbXVqTgWdlI6WhMj7p3tku5Ce1xYEYnLy5GnBbaT+V0aLrB7KBqF0nD4KUs2PSPMAm+v31rVkNv/A6R1y4t3j9AsR9xDysvxSa0xb+RPhWtU+j4GJnPEy7/AYabKUh02YzMhY06LkslTf8iHc= SpamDiagnosticOutput: 1:99 SpamDiagnosticMetadata: NSPM X-OriginatorOrg: virtuozzo.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Jun 2016 13:52:08.6774 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM3PR08MB0579 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jun 08, 2016 at 10:33:34AM +0200, Michal Hocko wrote: > On Fri 27-05-16 17:17:42, Vladimir Davydov wrote: > [...] > > @@ -970,26 +1028,25 @@ bool out_of_memory(struct oom_control *oc) > > !oom_unkillable_task(current, NULL, oc->nodemask) && > > current->signal->oom_score_adj != OOM_SCORE_ADJ_MIN) { > > get_task_struct(current); > > - oom_kill_process(oc, current, 0, totalpages, > > - "Out of memory (oom_kill_allocating_task)"); > > + oom_kill_process(oc, current, 0, totalpages); > > return true; > > } > > Do we really want to introduce sysctl_oom_kill_allocating_task to memcg > as well? Not sure, but why not? We take into account dump_tasks and panic_on_oom on memcg oom so why should we treat this sysctl differently? > The heuristic is quite dubious even for the global context IMHO > because it leads to a very random behavior. > > > p = select_bad_process(oc, &points, totalpages); > > /* Found nothing?!?! Either we hang forever, or we panic. */ > > - if (!p && !is_sysrq_oom(oc)) { > > + if (!p && !is_sysrq_oom(oc) && !oc->memcg) { > > dump_header(oc, NULL); > > panic("Out of memory and no killable processes...\n"); > > } > > if (p && p != (void *)-1UL) { > > - oom_kill_process(oc, p, points, totalpages, "Out of memory"); > > + oom_kill_process(oc, p, points, totalpages); > > /* > > * Give the killed process a good chance to exit before trying > > * to allocate memory again. > > */ > > schedule_timeout_killable(1); > > } > > - return true; > > + return !!p; > > } > > Now if you look at out_of_memory() the only shared "heuristic" with the > memcg part is the bypass for the exiting tasks. bypass exiting task (task_will_free_mem) check for panic (check_panic_on_oom) oom badness evaluation (oom_scan_process_thread or oom_evaluate_task after your patch) points calculation + kill (oom_kill_process) And if you need to modify any of these function calls or add yet another check, you have to do it twice. Ugly. > Plus both need the oom_lock. I believe locking could be unified for global/memcg oom cases too. > You have to special case oom notifiers, panic on no victim handling and > I guess the oom_kill_allocating task is not intentional either. So I > am not really sure this is an improvement. I even hate how we conflate > sysrq vs. regular global oom context together but my cleanup for that > has failed in the past. > > The victim selection code can be reduced because it is basically > shared between the two, only the iterator differs. But I guess that > can be eliminated by a simple helper. IMHO exporting a bunch of very oom-specific helpers (like those I enumerated above), partially revealing oom implementation, instead of well defined memcg helpers that could be reused anywhere else looks ugly. It's like having shrink_zone implementation both in vmscan.c and memcontrol.c with shrink_slab, shrink_lruvec, etc. exported, because we need to iterate over cgroups there.