From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752756AbcJ1EFp (ORCPT ); Fri, 28 Oct 2016 00:05:45 -0400
Received: from mail-pf0-f194.google.com ([209.85.192.194]:36261 "EHLO
	mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751072AbcJ1EFn (ORCPT ); Fri, 28 Oct 2016 00:05:43 -0400
Date: Fri, 28 Oct 2016 13:05:39 +0900
From: Sergey Senozhatsky
To: Linus Torvalds
Cc: Sergey Senozhatsky, Petr Mladek, Andrew Morton, Jan Kara, Tejun Heo,
	Calvin Owens, Thomas Gleixner, Mel Gorman, Steven Rostedt, Ingo Molnar,
	Peter Zijlstra, Laura Abbott, Andy Lutomirski, Kees Cook,
	Linux Kernel Mailing List, Sergey Senozhatsky
Subject: Re: [RFC][PATCHv4 0/6] printk: use printk_safe to handle printk() recursive calls
Message-ID: <20161028040539.GA612@swordfish>
References: <20161027154933.1211-1-sergey.senozhatsky@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.7.1 (2016-10-04)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello,

On (10/27/16 20:30), Linus Torvalds wrote:
> On Thu, Oct 27, 2016 at 8:49 AM, Sergey Senozhatsky wrote:
> >
> > RFC
> >
> > This patch set extends a lock-less NMI per-cpu buffers idea to
> > handle recursive printk() calls. The basic mechanism is pretty much the
> > same -- at the beginning of a deadlock-prone section we switch to lock-less
> > printk callback, and return back to a default printk implementation at the
> > end; the messages are getting flushed to a logbuf buffer from a safer
> > context.
>
> This looks very reasonable to me.
>
> Does this also obviate the need for "printk_deferred()" that the
> scheduler and the clock code uses? Because that would be a lovely
> thing to look at if it doesn't..

I wish I could say that we can retire printk_deferred(), but no, we
still need it.
it's rather simple to fix printk recursion (that's what the patch set
is doing), but printk deadlocks are much harder to handle. anything that
starts somewhere else but is somehow related to printk can deadlock (in
the worst case). I use this backtrace as an example:

 SyS_ioctl
  do_vfs_ioctl
   tty_ioctl
    n_tty_ioctl
     tty_mode_ioctl
      set_termios
       tty_set_termios
        uart_set_termios
         uart_change_speed
          FOO_serial_set_termios
           spin_lock_irqsave(&port->lock)     // lock the output port
           ....
           !! WARN() or pr_err() or printk()
               vprintk_emit()
                /* console_trylock() */
                console_unlock()
                 call_console_drivers()
                  FOO_write()
                   spin_lock_irqsave(&port->lock)     // already locked

with the current printk we can't tell for sure how many locks will be
acquired: printk() can succeed in locking the console_sem and start
invoking console drivers (if any) from console_unlock(), or it can fail,
in which case we acquire only the logbuf spin_lock and the console_sem
spin_lock.

things can change *a bit* once we switch to async_printk, because
instead of doing console_unlock()->call_console_drivers(), printk() will
just wake_up() the printk_kthread. but still, it won't be enough to
remove printk_deferred() :(

	vprintk_emit()
	 wake_up()
	  spin_lock rq lock
	   printk

will be safe.

but

	wake_up()
	 spin_lock rq lock
	  printk
	   vprintk_emit()
	    wake_up()
	     spin_lock rq lock

will deadlock.

we can't even tell for sure what locks are "important" to printk().
a small and reasonable code refactoring somewhere in clock code/etc. can
accidentally change the whole picture by introducing an "unsafe"
WARN_ON() or adding yet another lock to the printing path.

need to think more.

p.s. we are planning to discuss printk related issues next week in
Santa Fe.

	-ss
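
to illustrate the uart backtrace above, here is a minimal user-space sketch in Python. it is not kernel code, and every name in it (Port, console_write, log, log_deferred, flush_deferred, set_termios) is a hypothetical stand-in. the lock is taken with a non-blocking acquire so the sketch can report the point where a real spin_lock_irqsave() would spin forever. the deferred path only queues the message and prints it later from a context that does not hold the port lock, which is roughly the idea behind printk_deferred(), heavily simplified.

```python
import threading


class Port:
    """Hypothetical stand-in for a uart port guarded by a non-recursive lock."""

    def __init__(self):
        self.lock = threading.Lock()  # not re-entrant, like a spinlock
        self.deferred = []            # messages queued for a safer context
        self.written = []             # what actually reached the "console"


def console_write(port, msg):
    # Stand-in for FOO_write(): the driver needs port.lock to print.
    # A blocking acquire would hang if the caller already holds the lock,
    # so the sketch uses a non-blocking acquire and reports failure instead.
    if not port.lock.acquire(blocking=False):
        return False
    try:
        port.written.append(msg)
        return True
    finally:
        port.lock.release()


def log(port, msg):
    # Stand-in for printk(): goes straight to the console driver.
    return console_write(port, msg)


def log_deferred(port, msg):
    # Stand-in for the deferred path: only queue, never call the driver.
    port.deferred.append(msg)


def flush_deferred(port):
    # Runs later, from a context that does not hold port.lock.
    while port.deferred:
        console_write(port, port.deferred.pop(0))


def set_termios(port, use_deferred):
    # Stand-in for FOO_serial_set_termios(): warns while holding port.lock.
    with port.lock:
        if use_deferred:
            log_deferred(port, "bad baud rate")
            return True
        # False here marks the spot where the real code would deadlock.
        return log(port, "bad baud rate")
```

calling set_termios(port, use_deferred=False) returns False because the direct path runs into the lock that is already held. with use_deferred=True the message sits in port.deferred until flush_deferred(port) is called outside the locked section.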