Date: Tue, 9 Jan 2018 12:06:20 -0800
From: Tejun Heo
To: Steven Rostedt
Cc: Petr Mladek, Sergey Senozhatsky, Jan Kara, Andrew Morton,
	Peter Zijlstra, Rafael Wysocki, Pavel Machek, Tetsuo Handa,
	linux-kernel@vger.kernel.org, Sergey Senozhatsky
Subject: Re: [RFC][PATCHv6 00/12] printk: introduce printing kernel thread
Message-ID: <20180109200620.GQ3668920@devbig577.frc2.facebook.com>
In-Reply-To: <20171221231932.27727fab@vmware.local.home>

Hello, Steven.

My apologies for the late reply.  Was traveling and then got sick.

On Thu, Dec 21, 2017 at 11:19:32PM -0500, Steven Rostedt wrote:
> You don't think handing off printks to an offloaded thread isn't more
> complex nor can it cause even more issues (like more likely to lose
> relevant information on kernel crashes)?

Sergey's patch seems more complex (and probably handles more
requirements), but my original patch was pretty simple.

  http://lkml.kernel.org/r/20171102135258.GO3252168@devbig577.frc2.facebook.com

> > static enum hrtimer_restart printk_timerfn(struct hrtimer *timer)
> > {
> > 	int i;
> >
> > 	if (READ_ONCE(in_printk))
> > 		for (i = 0; i < 10000; i++)
> > 			printk("%-80s\n", "XXX TIMER");
>
> WTF!
>
> You are printing 10,000 printk messages from an interrupt context???
> And to top it off, I ran this on my box, switching printk() to
> trace_printk() (which is extremely low overhead). And it is triggered
> on the same CPU that did the printk() itself on. Yeah, there is no hand
> off, because you are doing a shitload of printks on one CPU and nothing
> on any of the other CPUs. This isn't the problem that my patch was set
> out to solve, nor is it a very realistic problem. I added a counter to
> the printk as well, to keep track of how many printks there were:

The code might suck, but I think it does replicate what we've been
seeing regularly in the fleet.  The console side is pretty slow - IPMI
faithfully emulating a serial console.  I don't know whether it's doing
115200 or even slower.  Please consider something like the following.

* The kernel isn't preemptible.  The machine runs out of memory and
  hits the OOM condition, and the kernel starts printing OOM
  information.

* Netconsole tries to send out the OOM messages and attempts memory
  allocations, which fail and in turn print allocation-failure
  messages.  Because this happens while printk is already printing,
  the new messages just get queued to the ring buffer.  This repeats.
* We're still in the middle of OOM and haven't killed anything yet, so
  memory stays short and the printk ring buffer keeps getting filled by
  the above.  Also, after a bit, RCU stall warnings kick in too,
  producing even more messages.

What's happening is that the OOM killer is trapped flushing printk and
thus failing to clear the memory condition, which leads irq / softirq
contexts to produce messages faster than they can be flushed.  I don't
see how we'd be able to clear the condition without introducing an
independent context to flush the ring buffer.

Again, this is an actual problem that we've been seeing fairly
regularly on production machines.

Thanks.

-- 
tejun
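
To make the "independent flushing context" idea above concrete, here is
a minimal illustrative sketch.  It is not the patch discussed in this
thread, and the printk_offload_* names are made up for illustration: a
dedicated kthread drains the console while the printk() caller (OOM
killer, irq, softirq) only appends to the ring buffer and wakes it.

/*
 * Sketch only -- not the actual patch from this thread.  A dedicated
 * kernel thread owns console flushing; the context that generates the
 * messages never flushes the slow console itself.
 */
#include <linux/kthread.h>
#include <linux/console.h>
#include <linux/wait.h>
#include <linux/sched.h>
#include <linux/init.h>
#include <linux/err.h>

static DECLARE_WAIT_QUEUE_HEAD(printk_offload_wait);	/* made-up name */
static bool printk_offload_pending;

/* Would be called from the printk path after a message is stored. */
static void printk_offload_kick(void)
{
	printk_offload_pending = true;
	wake_up(&printk_offload_wait);
}

static int printk_offload_thread(void *unused)
{
	while (!kthread_should_stop()) {
		wait_event_interruptible(printk_offload_wait,
					 printk_offload_pending ||
					 kthread_should_stop());
		printk_offload_pending = false;

		/*
		 * console_unlock() flushes whatever is queued in the
		 * ring buffer out to the (slow) consoles, e.g. an IPMI
		 * emulated serial console, without blocking the
		 * context that produced the messages.
		 */
		if (console_trylock())
			console_unlock();
	}
	return 0;
}

static int __init printk_offload_init(void)
{
	struct task_struct *t = kthread_run(printk_offload_thread, NULL,
					    "printk_offload");

	return PTR_ERR_OR_ZERO(t);
}
early_initcall(printk_offload_init);

The point of the split is that whoever generates the messages only pays
for appending to the ring buffer and a wake-up; only the dedicated
thread ever gets stuck behind the slow console.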