From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.7 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 99318C38A2B for ; Fri, 17 Apr 2020 15:26:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 54B6F20936 for ; Fri, 17 Apr 2020 15:26:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 54B6F20936 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 88C228E002E; Fri, 17 Apr 2020 11:26:41 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 816B78E0023; Fri, 17 Apr 2020 11:26:41 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7056E8E002E; Fri, 17 Apr 2020 11:26:41 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0168.hostedemail.com [216.40.44.168]) by kanga.kvack.org (Postfix) with ESMTP id 542938E0023 for ; Fri, 17 Apr 2020 11:26:41 -0400 (EDT) Received: from smtpin07.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 1B80652C3 for ; Fri, 17 Apr 2020 15:26:41 +0000 (UTC) X-FDA: 76717724202.07.rat96_1d35046906e23 X-HE-Tag: rat96_1d35046906e23 X-Filterd-Recvd-Size: 7583 Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by imf28.hostedemail.com (Postfix) with ESMTP for ; Fri, 17 Apr 2020 15:26:40 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id D8DC6ACAE; Fri, 17 Apr 2020 15:26:37 +0000 (UTC) Date: Fri, 17 Apr 2020 17:26:37 +0200 From: Michal Hocko To: Tetsuo Handa Cc: akpm@linux-foundation.org, guro@fb.com, linux-mm , rientjes@google.com, shakeelb@google.com, Yafang Shao Subject: Re: [nacked] mm-oom-avoid-printk-iteration-under-rcu.patch removed from -mm tree Message-ID: <20200417152637.GQ26707@dhcp22.suse.cz> References: <20200131043324.wkArXTf6-%akpm@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri 17-04-20 23:33:41, Tetsuo Handa wrote: > On 2020/01/31 13:33, akpm@linux-foundation.org wrote: > > The patch titled > > Subject: mm, oom: avoid printk() iteration under RCU > > has been removed from the -mm tree. Its filename was > > mm-oom-avoid-printk-iteration-under-rcu.patch > > > > This patch was dropped because it was nacked > > This patch was once nacked by Michal Hocko. But based on > https://lkml.kernel.org/r/CALOAHbArHa-QZ2hXnLOhEJ3Ti=C69-LPtjZB9zqPcuTKeHsV4g@mail.gmail.com > that dump_tasks() will become a real world problem with slow consoles (even if > printk() becomes asynchronous), I propose you to revive this patch as a first step for > implementing workable ratelimit for dump_tasks(). This patch can be applied as-is > on current linux.git and linux-next.git (only offset difference). My nack still applies. Slow consoles are really a pain and they are going to be addressed by the upcoming printk work from a large part. Your patch doesn't address the problem it merely shift it around while making code even more obscure and harder to maintain. For the specific issue that you are pointing out, a simple and working workaround is to reduce the loglevel or disable dump_tasks. There is no real reason to add ugly kluges IMHO. > > ------------------------------------------------------ > > From: Tetsuo Handa > > Subject: mm, oom: avoid printk() iteration under RCU > > > > Currently dump_tasks() might call printk() for many thousands times under > > RCU, which might take many minutes for slow consoles. Therefore, split > > dump_tasks() into three stages; take a snapshot of possible OOM victim > > candidates under RCU, dump the snapshot from reschedulable context, and > > destroy the snapshot. > > > > In a future patch, the first stage would be moved to select_bad_process() > > and the third stage would be moved to after oom_kill_process(), and will > > simplify refcount handling. > > > > [akpm@linux-foundation.org: make dump_tasks::list non-static] > > Link: http://lkml.kernel.org/r/1563360901-8277-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp > > Signed-off-by: Tetsuo Handa > > Cc: Shakeel Butt > > Cc: Michal Hocko > > Cc: Roman Gushchin > > Cc: David Rientjes > > Signed-off-by: Andrew Morton > > --- > > > > include/linux/sched.h | 1 > > mm/oom_kill.c | 67 +++++++++++++++++++--------------------- > > 2 files changed, 34 insertions(+), 34 deletions(-) > > > > --- a/include/linux/sched.h~mm-oom-avoid-printk-iteration-under-rcu > > +++ a/include/linux/sched.h > > @@ -1261,6 +1261,7 @@ struct task_struct { > > #ifdef CONFIG_MMU > > struct task_struct *oom_reaper_list; > > #endif > > + struct list_head oom_victim_list; > > #ifdef CONFIG_VMAP_STACK > > struct vm_struct *stack_vm_area; > > #endif > > --- a/mm/oom_kill.c~mm-oom-avoid-printk-iteration-under-rcu > > +++ a/mm/oom_kill.c > > @@ -377,36 +377,13 @@ static void select_bad_process(struct oo > > } > > } > > > > -static int dump_task(struct task_struct *p, void *arg) > > -{ > > - struct oom_control *oc = arg; > > - struct task_struct *task; > > - > > - if (oom_unkillable_task(p)) > > - return 0; > > > > - /* p may not have freeable memory in nodemask */ > > - if (!is_memcg_oom(oc) && !oom_cpuset_eligible(p, oc)) > > - return 0; > > - > > - task = find_lock_task_mm(p); > > - if (!task) { > > - /* > > - * This is a kthread or all of p's threads have already > > - * detached their mm's. There's no need to report > > - * them; they can't be oom killed anyway. > > - */ > > - return 0; > > +static int add_candidate_task(struct task_struct *p, void *arg) > > +{ > > + if (!oom_unkillable_task(p)) { > > + get_task_struct(p); > > + list_add_tail(&p->oom_victim_list, (struct list_head *) arg); > > } > > - > > - pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n", > > - task->pid, from_kuid(&init_user_ns, task_uid(task)), > > - task->tgid, task->mm->total_vm, get_mm_rss(task->mm), > > - mm_pgtables_bytes(task->mm), > > - get_mm_counter(task->mm, MM_SWAPENTS), > > - task->signal->oom_score_adj, task->comm); > > - task_unlock(task); > > - > > return 0; > > } > > > > @@ -422,19 +399,41 @@ static int dump_task(struct task_struct > > */ > > static void dump_tasks(struct oom_control *oc) > > { > > - pr_info("Tasks state (memory values in pages):\n"); > > - pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n"); > > + LIST_HEAD(list); > > + struct task_struct *p; > > + struct task_struct *t; > > > > if (is_memcg_oom(oc)) > > - mem_cgroup_scan_tasks(oc->memcg, dump_task, oc); > > + mem_cgroup_scan_tasks(oc->memcg, add_candidate_task, &list); > > else { > > - struct task_struct *p; > > - > > rcu_read_lock(); > > for_each_process(p) > > - dump_task(p, oc); > > + add_candidate_task(p, &list); > > rcu_read_unlock(); > > } > > + pr_info("Tasks state (memory values in pages):\n"); > > + pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n"); > > + list_for_each_entry(p, &list, oom_victim_list) { > > + cond_resched(); > > + /* p may not have freeable memory in nodemask */ > > + if (!is_memcg_oom(oc) && !oom_cpuset_eligible(p, oc)) > > + continue; > > + /* All of p's threads might have already detached their mm's. */ > > + t = find_lock_task_mm(p); > > + if (!t) > > + continue; > > + pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n", > > + t->pid, from_kuid(&init_user_ns, task_uid(t)), > > + t->tgid, t->mm->total_vm, get_mm_rss(t->mm), > > + mm_pgtables_bytes(t->mm), > > + get_mm_counter(t->mm, MM_SWAPENTS), > > + t->signal->oom_score_adj, t->comm); > > + task_unlock(t); > > + } > > + list_for_each_entry_safe(p, t, &list, oom_victim_list) { > > + list_del(&p->oom_victim_list); > > + put_task_struct(p); > > + } > > } > > > > static void dump_oom_summary(struct oom_control *oc, struct task_struct *victim) > > _ > > > > Patches currently in -mm which might be from penguin-kernel@I-love.SAKURA.ne.jp are > > > > > -- Michal Hocko SUSE Labs