From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753040AbbKZIxx (ORCPT ); Thu, 26 Nov 2015 03:53:53 -0500
Received: from mx2.suse.de ([195.135.220.15]:46068 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751968AbbKZIxt (ORCPT ); Thu, 26 Nov 2015 03:53:49 -0500
Date: Thu, 26 Nov 2015 09:53:45 +0100
From: Jan Kara
To: Tejun Heo
Cc: Jan Kara, Andrew Morton, Calvin Owens, Dave Jones, Kyle McMartin,
        linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] printk: do cond_resched() between lines while outputting to consoles
Message-ID: <20151126085345.GB9919@quack.suse.cz>
References: <20151124213125.GA16368@mtj.duckdns.org>
        <20151125090522.GK25232@quack.suse.cz>
        <20151125170217.GA29681@mtj.duckdns.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151125170217.GA29681@mtj.duckdns.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On Wed 25-11-15 12:02:17, Tejun Heo wrote:
> On Wed, Nov 25, 2015 at 10:05:22AM +0100, Jan Kara wrote:
> > So did you particularly have an issue during console registration? Because
>
> Yeap, we're seeing a small ratio of machines falling head over hills
> during IPMI serial console registration. Pumping out the messages
> collected prior to registration takes too long triggering softlockup
> warning on all forty something CPUs which pile a metric ton of
> messages atop. From then on, softlockup / rcu stall warnings repeat
> themselves. Some machines recover after >10mins of doing that. The
> log is hillarious to look at afterward.

OK, then feel free to add my:

Acked-by: Jan Kara

> > at least our customers mostly have issues during heavy use of ordinary
> > printk (e.g. during boot or when hardware gets probed) and your change
> > doesn't affect that case. That being said if you really hit a case where
>
> Hah, that must be a lot of messages being printk'd.

Yes, it is. They have ~1000 SCSI devices attached (250 disks, each over 4
paths) and similar stuff. But also doing sysrq-t on a large machine
generates enough output to kill the machine...

> > your patch helps, then I have no problem with it (you can add my Acked-by).
> >
> > At Kernel Summit I spoke with Linus and Andrew regarding printk softlockups
> > and we ended up with a decision that we decouple queueing into kernel
> > ringbuffer from the actual printing into console which would happen from
> > kthread / workqueue. Then the lockups would be solved by printing to
> > console happening from schedulable context and printk() as such being
> > independent from console speed. We only have to have some special cases
> > there for crashes so that messages get printed synchronously in that case.
>
> Yeah, we'd prolly want to make the behavior contingent on the time
> taken and so on. At any rate, even with workqueue-deferred dumping,
> this patch would still be necessary for non-preemptible kernels;
> otherwise, there's no cond_resched() in printing path right now.

Yup.

                                                                Honza
-- 
Jan Kara
SUSE Labs, CR