From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758839AbcHaBog (ORCPT ); Tue, 30 Aug 2016 21:44:36 -0400 Received: from mail-pa0-f68.google.com ([209.85.220.68]:34416 "EHLO mail-pa0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752054AbcHaBod (ORCPT ); Tue, 30 Aug 2016 21:44:33 -0400 Date: Wed, 31 Aug 2016 10:44:41 +0900 From: Sergey Senozhatsky To: Andrew Morton Cc: Sergey Senozhatsky , Petr Mladek , Jan Kara , linux-kernel@vger.kernel.org, Sergey Senozhatsky Subject: Re: [PATCH v2] printk/nmi: avoid direct printk()-s from __printk_nmi_flush() Message-ID: <20160831014441.GA472@swordfish> References: <20160830161354.581-1-sergey.senozhatsky@gmail.com> <20160830150315.93efc592aa631f474af760b5@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160830150315.93efc592aa631f474af760b5@linux-foundation.org> User-Agent: Mutt/1.7.0 (2016-08-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, On (08/30/16 15:03), Andrew Morton wrote: > > __printk_nmi_flush() can be called from nmi_panic(), therefore it has to > > test whether it's executed in NMI context and thus must route the messages > > through deferred printk() or via direct printk(). > > Why? What misbehaviour does the current code cause? the reasoning behind the `if in_nmi()' in print_nmi_seq_line() if (in_nmi()) printk_deferred("%.*s", (end - start) + 1, buf); else printk("%.*s", (end - start) + 1, buf); was as follows (per Petr's commit message) : In NMI context, printk() messages are stored into per-CPU buffers to : avoid a possible deadlock. They are normally flushed to the main ring : buffer via an IRQ work. But the work is never called when the system : calls panic() in the very same NMI handler. : : This patch tries to flush NMI buffers before the crash dump is : generated. In this case it does not risk a double release and bails out : when the logbuf_lock is already taken. The aim is to get the messages : into the main ring buffer when possible. It makes them better : accessible in the vmcore. : : Then the patch tries to flush the buffers second time when other CPUs : are down. It might be more aggressive and reset logbuf_lock. The aim : is to get the messages available for the consequent kmsg_dump() and : console_flush_on_panic() calls. : : The patch causes vprintk_emit() to be called even in NMI context again. : But it is done via printk_deferred() so that the console handling is : skipped. Consoles use internal locks and we could not prevent a : deadlock easily. They are explicitly called later when the crash dump : is not generated, see console_flush_on_panic(). doing pr_err() and pr_cont() defeats the purpose of print_nmi_seq_line(), because in the worst case it can do something like this: vprintk_emit() /* console_trylock() */ console_unlock() call_console_drivers() foo_write() // possibly locked the tricky moment here is -- if the console semaphore is unlocked and NMI can successfully console_trylock(), then how any of underlying serial console driver's locks can be taken? one possibility is: CPU0 SyS_ioctl do_vfs_ioctl tty_ioctl n_tty_ioctl tty_mode_ioctl set_termios tty_set_termios uart_set_termios uart_change_speed FOO_serial_set_termios spin_lock_irqsave(&port->lock) // lock the output port ... --> NMI nmi_panic() printk_nmi_flush_on_panic() __printk_nmi_flush() pr_cont() /* console_trylock() */ console_unlock() call_console_drivers() foo_write() spin_lock_irqsave(&port->lock) // already locked as far as I can see, tty_ioctl() path does not lock console semaphore, but grabs some tty locks instead. -ss