Date: Wed, 25 Nov 2015 12:02:17 -0500
From: Tejun Heo
To: Jan Kara
Cc: Andrew Morton, Calvin Owens, Dave Jones, Kyle McMartin,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] printk: do cond_resched() between lines while outputting to consoles
Message-ID: <20151125170217.GA29681@mtj.duckdns.org>
References: <20151124213125.GA16368@mtj.duckdns.org>
	<20151125090522.GK25232@quack.suse.cz>
In-Reply-To: <20151125090522.GK25232@quack.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Jan.

On Wed, Nov 25, 2015 at 10:05:22AM +0100, Jan Kara wrote:
> So did you particularly have an issue during console registration? Because

Yeap, we're seeing a small ratio of machines falling head over heels
during IPMI serial console registration.  Pumping out the messages
collected prior to registration takes too long, triggering softlockup
warnings on all forty-something CPUs, which pile a metric ton of
messages atop.  From then on, softlockup / rcu stall warnings repeat
themselves.  Some machines recover after >10mins of doing that.  The
log is hilarious to look at afterward.

> at least our customers mostly have issues during heavy use of ordinary
> printk (e.g. during boot or when hardware gets probed) and your change
> doesn't affect that case. That being said if you really hit a case where

Hah, that must be a lot of messages being printk'd.

> your patch helps, then I have no problem with it (you can add my Acked-by).
>
> At Kernel Summit I spoke with Linus and Andrew regarding printk softlockups
> and we ended up with a decision that we decouple queueing into kernel
> ringbuffer from the actual printing into console which would happen from
> kthread / workqueue. Then the lockups would be solved by printing to
> console happening from schedulable context and printk() as such being
> independent from console speed. We only have to have some special cases
> there for crashes so that messages get printed synchronously in that case.

Yeah, we'd prolly want to make the behavior contingent on the time
taken and so on.  At any rate, even with workqueue-deferred dumping,
this patch would still be necessary for non-preemptible kernels;
without it, there's no cond_resched() in the printing path right now.

Thanks.

-- 
tejun