Date: Thu, 11 Jan 2018 14:15:24 +0900
From: Sergey Senozhatsky
To: Tejun Heo, Peter Zijlstra
Cc: Petr Mladek, Linus Torvalds, akpm@linux-foundation.org,
	Steven Rostedt, Sergey Senozhatsky, linux-mm@kvack.org, Cong Wang,
	Dave Hansen, Johannes Weiner, Mel Gorman, Michal Hocko,
	Vlastimil Babka, Jan Kara, Mathieu Desnoyers, Tetsuo Handa,
	rostedt@home.goodmis.org, Byungchul Park, Sergey Senozhatsky,
	Pavel Machek, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 0/2] printk: Console owner and waiter logic cleanup
Message-ID: <20180111051524.GC494@jagdpanzerIV>
References: <20180110132418.7080-1-pmladek@suse.com>
	<20180110140547.GZ3668920@devbig577.frc2.facebook.com>
	<20180110162900.GA21753@linux.suse>
	<20180110170223.GF3668920@devbig577.frc2.facebook.com>
	<20180110182153.GP6176@hirez.programming.kicks-ass.net>
In-Reply-To: <20180110182153.GP6176@hirez.programming.kicks-ass.net>

On (01/10/18 19:21), Peter Zijlstra wrote:
>
> On Wed, Jan 10, 2018 at 09:02:23AM -0800, Tejun Heo wrote:
> > 2. System runs out of memory, OOM triggers.
> >
> > 3. OOM handler is printing out OOM debug info.
> >
> > 4. While trying to emit the messages for netconsole, the network stack
> >    / driver tries to allocate memory and then fails, which in turn
> >    triggers allocation failure or other warning messages.  printk was
> >    already flushing, so the messages are queued on the ring.
> >
> > 5. OOM handler keeps flushing but 4 repeats and the queue is never
> >    shrinking.  Because OOM handler is trapped in printk flushing, it
> >    never manages to free memory and no one else can enter OOM path
> >    either, so the system is trapped in this state.
>
> Why not kill recursive OOM (msgs) ?

hm... do I understand it correctly that there is a

	console_unlock()->call_console_drivers()->FOO_write()->kmalloc()->printk()

recursion?

we call console drivers from printk-safe context now. so those printks
from kmalloc are redirected to the per-CPU printk-safe buffer, which is
limited in size (so we might start losing some of those OOM messages)
and which is flushed (log_store()) from another context.

	-ss
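
For illustration only: the redirection described above can be modelled
with a small userspace toy. This is not kernel code. Every name in it
(toy_printk(), toy_console_unlock(), toy_safe_flush(), the buffer sizes)
is hypothetical and only mimics the behaviour discussed in this thread:
a printk issued while console drivers run goes to a small side buffer,
may be dropped when that buffer is full, and reaches the main log only
when flushed later from a different context.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SAFE_BUF_SIZE 64	/* hypothetical, deliberately tiny */
#define LOG_SIZE      256	/* hypothetical main log size */

/* Toy stand-ins for the state of a single CPU. */
static char safe_buf[SAFE_BUF_SIZE];
static size_t safe_len;
static unsigned int safe_lost;	/* messages dropped, side buffer full */
static int safe_depth;		/* > 0 while in "printk-safe" context */

static char main_log[LOG_SIZE];
static size_t main_len;

/* Append n bytes to the main log (toy model of log_store()). */
static void toy_log_store(const char *text, size_t n)
{
	if (main_len + n >= LOG_SIZE)
		return;
	memcpy(main_log + main_len, text, n);
	main_len += n;
	main_log[main_len] = '\0';
}

/* In safe context: redirect to the side buffer, or count the loss. */
static void toy_printk(const char *msg)
{
	size_t n = strlen(msg);

	if (!safe_depth) {
		toy_log_store(msg, n);
		return;
	}
	if (safe_len + n > SAFE_BUF_SIZE) {
		safe_lost++;
		return;
	}
	memcpy(safe_buf + safe_len, msg, n);
	safe_len += n;
}

/* Stand-in for FOO_write(): its allocation fails and it warns. */
static void toy_console_write(const char *msg)
{
	(void)msg;
	toy_printk("alloc failure warning\n");
}

/* Console drivers are called with the safe context entered. */
static void toy_console_unlock(const char **pending, size_t count)
{
	size_t i;

	safe_depth++;
	for (i = 0; i < count; i++)
		toy_console_write(pending[i]);
	safe_depth--;
}

/* Runs later, from another context: move side buffer to main log. */
static void toy_safe_flush(void)
{
	assert(safe_depth == 0);
	toy_log_store(safe_buf, safe_len);
	safe_len = 0;
}
```

With five pending messages, each toy_console_write() call emits one
22-byte warning. Two of them fit in the 64-byte side buffer, three are
counted in safe_lost, and the main log stays empty until
toy_safe_flush() runs. So in this model the flushing loop does not feed
itself, at the cost of losing messages.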