From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932166AbaDHO1b (ORCPT ); Tue, 8 Apr 2014 10:27:31 -0400
Received: from cantor2.suse.de ([195.135.220.15]:34701 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757275AbaDHO13 (ORCPT ); Tue, 8 Apr 2014 10:27:29 -0400
Date: Tue, 8 Apr 2014 16:27:26 +0200
From: Jan Kara
To: Andrew Morton
Cc: LKML, pmladek@suse.cz, Frederic Weisbecker, Steven Rostedt, Jan Kara
Subject: Re: [PATCH 0/8 v4] printk: Cleanups and softlockup avoidance
Message-ID: <20140408142726.GB9551@quack.suse.cz>
References: <1395770101-24534-1-git-send-email-jack@suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1395770101-24534-1-git-send-email-jack@suse.cz>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 25-03-14 18:54:53, Jan Kara wrote:
> Hello,
>
> this is another revision of the printk softlockup series.
>
> Changes since v3:
> Fixed bogus warning in console_try_lock_spin() in non-preemptible kernels.
> Fixed infinite loop in console_flush() when console was suspended.
>
> Changes since v2:
> I have fixed up some small problems pointed out by Andrew, added possibility to
> configure out the printk offloading logic (for small systems), and offload
> kthreads are now started only once printk.offload_chars is set to value > 0.
>
> Intro for the newcomers to the series below.
  Ping Andrew?

                                                                Honza
>
> ---
>
> Currently, console_unlock() prints messages from kernel printk buffer to
> console while the buffer is non-empty. When serial console is attached,
> printing is slow and thus other CPUs in the system have plenty of time
> to append new messages to the buffer while one CPU is printing. Thus the
> CPU can spend unbounded amount of time doing printing in console_unlock().
> This is especially serious since vprintk_emit() calls console_unlock()
> with interrupts disabled.
>
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Also during artificial stress testing SATA disk
> disappears from the system because its interrupts aren't served for too
> long.
>
> This is a revised series using my new approach to the problem which doesn't
> let CPU out of console_unlock() until there's someone else to take over the
> printing. The main difference since the last version is that instead of
> passing printing duty to different CPUs via IPIs we use dedicated kthreads.
> This method is somewhat less reliable (in a sense that there are more
> situations in which handover needn't work at all - e.g. when the currently
> printing CPU holds a spinlock and the CPU where kthread is scheduled to run is
> spinning on this spinlock) but the code is much simpler and in my practical
> testing kthread approach was good enough to avoid any problems (with one
> exception - see patch 8/8).
>
>                                                                 Honza
-- 
Jan Kara
SUSE Labs, CR
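
For readers new to the series, the problem described in the quoted intro can be
sketched as a small userspace model. This is not code from the patches and uses
no kernel APIs: log_model, print_batch(), drain_unbounded() and drain_bounded()
are hypothetical names made up for illustration, and only the offload_chars
parameter borrows its name from printk.offload_chars. The model assumes other
CPUs append a fixed number of characters after each printed batch. It also
simplifies the handover: it merely flags the point where the printing CPU would
ask for help, whereas the series keeps the CPU in console_unlock() until
someone else actually takes over.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the printk buffer; not a kernel structure. */
struct log_model {
    size_t pending;     /* characters waiting to be printed */
    size_t refill;      /* characters other CPUs append per printed batch */
    size_t refill_left; /* how many more times other CPUs will append */
};

/* Print up to `batch` characters, then let other CPUs append new messages. */
size_t print_batch(struct log_model *m, size_t batch)
{
    assert(batch > 0); /* a zero batch would never drain the buffer */
    size_t n = m->pending < batch ? m->pending : batch;
    m->pending -= n;
    if (m->refill_left > 0) {
        m->pending += m->refill;
        m->refill_left--;
    }
    return n;
}

/* Behaviour described in the intro: print while the buffer is non-empty. */
size_t drain_unbounded(struct log_model *m, size_t batch)
{
    size_t printed = 0;
    while (m->pending > 0)
        printed += print_batch(m, batch);
    return printed;
}

/*
 * Bounded variant (assumption): once offload_chars characters have been
 * printed, stop and set *handover so that someone else can continue.
 * offload_chars == 0 disables the limit.
 */
size_t drain_bounded(struct log_model *m, size_t batch,
                     size_t offload_chars, int *handover)
{
    size_t printed = 0;
    *handover = 0;
    while (m->pending > 0) {
        if (offload_chars > 0 && printed >= offload_chars) {
            *handover = 1;
            break;
        }
        printed += print_batch(m, batch);
    }
    return printed;
}
```

In this model the time spent in drain_unbounded() grows with however much the
other CPUs append while printing is in progress, which is the unbounded
behaviour the intro describes. drain_bounded() caps the work done by a single
caller and leaves the remainder in the buffer for whoever takes over.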