From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753395AbcJDOwZ (ORCPT ); Tue, 4 Oct 2016 10:52:25 -0400 Received: from mx2.suse.de ([195.135.220.15]:60802 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751125AbcJDOwY (ORCPT ); Tue, 4 Oct 2016 10:52:24 -0400 Date: Tue, 4 Oct 2016 16:52:21 +0200 From: Petr Mladek To: Sergey Senozhatsky Cc: Sergey Senozhatsky , Jan Kara , Andrew Morton , Tejun Heo , Calvin Owens , linux-kernel@vger.kernel.org Subject: Re: [RFC][PATCH 6/7] printk: use alternative printk buffers Message-ID: <20161004145221.GF13369@pathway.suse.cz> References: <20160927142237.5539-1-sergey.senozhatsky@gmail.com> <20160927142237.5539-7-sergey.senozhatsky@gmail.com> <20160929130000.GE26796@pathway.suse.cz> <20160930011544.GC547@swordfish> <20160930111546.GI26796@pathway.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160930111546.GI26796@pathway.suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri 2016-09-30 13:15:46, Petr Mladek wrote: > On Fri 2016-09-30 10:15:44, Sergey Senozhatsky wrote: > > On (09/29/16 15:00), Petr Mladek wrote: > > [..] > > > > @@ -1791,7 +1791,7 @@ asmlinkage int vprintk_emit(int facility, int level, > > > > zap_locks(); > > > > } > > > > > > > > - lockdep_off(); > > > > + alt_printk_enter(); > > > > > > IMHO, we could not longer enter vprintk_emit() recursively. The same > > > section that was guarded by logbuf_cpu is guarded by > > > alt_printk_enter()/exit() now. > > > > you might be very right here. I'll take a look. > > > > > IMHO, we could remove all the logic around the recursion. Then we > > > could even disable/enable irqs inside alt_printk_enter()/exit(). > > > > I was thinking of doing something like this; but that would require > > storing 'unsigned long' flags in per-cpu data > > > > alt_enter() > > { > > unsinged long flags; > > > > local_irq_save(flags); > > ctx = this_cpu_ptr(); > > ctx->flags = flags; > > ... > > } > > > > alt_exit() > > { > > ctx = this_cpu_ptr(); > > ... > > local_irq_restore(ctx->flags); > > } > > > > and the decision was to keep `unsigned long flags' on stack in the > > alt_enter/exit caller. besides in most of the cases we already have > > it (in vprintk_emit() and console_unlock()). > > I would pass the pointer to flags as alt_enter() parameter. > > > > but I can certainly hide these details in alt_enter/exit. > > > > > > @@ -2479,7 +2490,9 @@ void console_unlock(void) > > > > */ > > > > raw_spin_lock(&logbuf_lock); > > > > retry = console_seq != log_next_seq; > > > > - raw_spin_unlock_irqrestore(&logbuf_lock, flags); > > > > + raw_spin_unlock(&logbuf_lock); > > > > + alt_printk_exit(); > > > > + local_irq_restore(flags); > > > > > > We should mention that this patch makes an obsolete artefact from > > > printk_deferred(). It opens the door for another big cleanup and > > > relief. > > > > do you mean that, once alt_printk is done properly, we can drop > > printk_deferred()? I was thinking of it, but decided not to > > mention/touch it in this patch set. > > My understanding is the following: > > The difference between normal printk() and printk_deferred() is > that the other does not call console_trylock()/console_unlock(). > It means that printk_deferred() can avoid recursion only from these > two calls. > > printk_deferred() is used only in scheduler and timekeeping code. > Therefore it prevents only limited number of possible recursions > and deadlocks at the moment. > > This patch guards most of the two calls a more generic way. > The redirected parts prevent recursion not only to into the > code guarded by console_sem but also into parts guarded > by lockbuf_lock. > > By other words, this patch is supposed to handle a superset > of the deadlocks that are currently prevented by printk_deferred(). > If this is true, we do not longer need printk_deferred(). > > The only question is if this patch guards enough parts of > console_try_lock()/console_unlock() to handle the superset > of the possible deadlocks. > > I see that it does not guard two up_console_sem() calls > from console_unlock(). But this can be fixed in the next > version. > > Or is there any other catch that I do not see at the moment? And there is :-( The above logic looked at the problem only from one side. It was about errors starting from the printk() code itself, for example: vprintk_emit() console_unlock() up() << raw_spin_lock_irqsave(&sem->lock, flags); wake_up_process() try_to_wake_up() ttwu_queue() ttwu_activate() activate_task() enqueue_task() enqueue_task_fair() cfs_rq_of() task_of() WARN_ON_ONCE(!entity_is_task(se)) vprintk_emit() console_trylock() down_trylock() raw_spin_lock_irqsave(&sem->lock, flags) ^^^^ deadlock But it does no solve errors starting in the scheduler or timekeeping code. I mean: any_function_waking_process() wake_up_process() try_to_wake_up() // takes &p->pi_lock ttwu_queue() ttwu_activate() activate_task() enqueue_task() enqueue_task_fair() cfs_rq_of() task_of() WARN_ON_ONCE(!entity_is_task(se)) vprintk_emit() console_trylock() // success console_unlock() up_console_sem() up() __up() wake_up_process() try_to_wake_up() raw_spin_lock_irqsave(&p->pi_lock, flags); ^^^^ deadlock The only thing that might help here is to call alt_printk_enter()/exit() in wake_up_process() itself. Otherwise, we still would need to keep the printk_deferred() stuff. By other words, we might need to put alt_printk_enter()/exit() into the scheduler and timekeeping code. In theory it might be easier to maintain than the separated printk_deferred() calls. But there might be some catches because we need to disable the interrupts, ... Sigh, this 2nd scenario is much more likely than the 1st one. I guess that warnings in the scheduler/timekeeping code will be triggered outside printk() most of the time. It means that this approach might be much harder to sell after all :-( Best Regards, Petr