From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752158AbdBCCpa (ORCPT ); Thu, 2 Feb 2017 21:45:30 -0500
Received: from mail-pf0-f196.google.com ([209.85.192.196]:34561 "EHLO
	mail-pf0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751771AbdBCCp2 (ORCPT ); Thu, 2 Feb 2017 21:45:28 -0500
Date: Fri, 3 Feb 2017 11:45:43 +0900
From: Sergey Senozhatsky
To: Petr Mladek
Cc: Sergey Senozhatsky, Peter Zijlstra, Jan Kara, Ross Zwisler,
	Sergey Senozhatsky, Ross Zwisler, Andrew Morton, Linus Torvalds,
	Tejun Heo, Calvin Owens, Steven Rostedt, Ingo Molnar,
	Andy Lutomirski, Peter Hurley, LKML
Subject: Re: [PATCHv7 6/8] printk: use printk_safe buffers in printk
Message-ID: <20170203024543.GD6228@jagdpanzerIV.localdomain>
References: <20161227141611.940-1-sergey.senozhatsky@gmail.com>
	<20161227141611.940-7-sergey.senozhatsky@gmail.com>
	<20170201090625.GC11567@quack2.suse.cz>
	<20170201093739.GT6515@twins.programming.kicks-ass.net>
	<20170201153910.GL6620@pathway.suse.cz>
	<20170202021134.GC1954@jagdpanzerIV.localdomain>
	<20170202090722.GW6515@twins.programming.kicks-ass.net>
	<20170202100348.GA364@jagdpanzerIV.localdomain>
	<20170202152002.GE23754@pathway.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170202152002.GE23754@pathway.suse.cz>
User-Agent: Mutt/1.7.2 (2016-11-26)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On (02/02/17 16:20), Petr Mladek wrote:
> > well, I wouldn't say that printk_deferred() has less chances. I see your
> > point, of course. but with printk_deferred() we, at least, will have messages
> > in logbuf (or printk_safe buffers), so they can appear in crash dump, for
> > instance. that "later" part can be sysrq, for example, or panic->flush_on_panic(),
> > etc. if "normal" printk->queue irq_work doesn't work.
> >
> > needless to say, that in this particular case (WARN from sched), if the
> > first printk() out of N printk()-s, which sched core calls to dump_stack(),
> > deadlocks, then we got nothing to print/dump.
>
> An always deferred printk() or another deferred ways are future work.
> We should try to find a good solution, definitely.
>
> The question is what to do with this patch. We need to change things
> step by step. The printk_safe patchset is one of them and looks
> almost ready.
>
> The lockdep warnings are correct and help to find locations where
> scheduler warnings might cause a deadlock.

I like that lockdep warning.

and looking at it... I think lockdep does not add any additional risks.
we are in deadlock risky sched->printk condition due to WARN from sched,
not lockdep. the lockdep warning that we see happens after we switch to
printk_safe mode.

please see console_trylock()->__down_trylock_console_sem()

static int __down_trylock_console_sem(unsigned long ip)
{
	...
	printk_safe_enter_irqsave(flags);
	lock_failed = down_trylock(&console_sem);	<< print_circular_bug() comes from here
	printk_safe_exit_irqrestore(flags);
	...
}

so the unsafe/safe printk 'map' should be as follows

[   13.090679] Call Trace:
[   13.090680]  dump_stack+0x86/0xc3
[   13.090680]  print_circular_bug+0x1be/0x210		<< still in printk_safe
[   13.090680]  __lock_acquire+0x10e5/0x1270
[   13.090681]  lock_acquire+0xfd/0x200
[   13.090681]  ? down_trylock+0x14/0x40
[   13.090681]  _raw_spin_lock_irqsave+0x59/0x93
[   13.090681]  ? down_trylock+0x14/0x40
[   13.090682]  ? vprintk_emit+0x2c7/0x3a0
[   13.090682]  down_trylock+0x14/0x40
[   13.090682]  __down_trylock_console_sem+0x3c/0xc0	<< we are in printk_safe now (!)
[   13.090683]  console_trylock+0x16/0x90
[   13.090683]  ? trace_hardirqs_off+0xd/0x10
[   13.090683]  vprintk_emit+0x2c7/0x3a0
[   13.090684]  ? update_load_avg+0x85b/0xb80
[   13.090684]  vprintk_default+0x29/0x50
[   13.090684]  vprintk_func+0x25/0x80			<< we are in unsafe printk here (!)
[   13.090684]  printk+0x52/0x6e
[   13.090685]  ? update_load_avg+0x85b/0xb80
[   13.090685]  __warn+0x39/0xf0
[   13.090685]  warn_slowpath_fmt+0x5f/0x80
[   13.090686]  update_load_avg+0x85b/0xb80
[   13.090686]  ? debug_smp_processor_id+0x17/0x20
[   13.090686]  detach_task_cfs_rq+0x3f/0x210
[   13.090687]  task_change_group_fair+0x24/0x100
[   13.090687]  sched_change_group+0x5f/0x110
[   13.090687]  sched_move_task+0x53/0x160
[   13.090687]  cpu_cgroup_attach+0x36/0x70
[   13.090688]  cgroup_migrate_execute+0x230/0x3f0
[   13.090688]  cgroup_migrate+0xce/0x140
[   13.090688]  ? cgroup_migrate+0x5/0x140
[   13.090689]  cgroup_attach_task+0x27f/0x3e0
[   13.090689]  ? cgroup_attach_task+0x9b/0x3e0
[   13.090689]  __cgroup_procs_write+0x30e/0x510
[   13.090690]  ? __cgroup_procs_write+0x70/0x510
[   13.090690]  cgroup_procs_write+0x14/0x20
[   13.090690]  cgroup_file_write+0x44/0x1e0
[   13.090690]  kernfs_fop_write+0x13c/0x1c0
[   13.090691]  __vfs_write+0x37/0x160
[   13.090691]  ? rcu_read_lock_sched_held+0x4a/0x80
[   13.090691]  ? rcu_sync_lockdep_assert+0x2f/0x60
[   13.090692]  ? __sb_start_write+0x10d/0x220
[   13.090692]  ? vfs_write+0x19b/0x1f0
[   13.090692]  ? security_file_permission+0x3b/0xc0
[   13.090693]  vfs_write+0xcb/0x1f0
[   13.090693]  SyS_write+0x58/0xc0
[   13.090693]  entry_SYSCALL_64_fastpath+0x1f/0xc2

that unsafe console_trylock() is not caused by lockdep. yes, we can
deadlock in down_trylock(), but lockdep is not the root cause. and if
we disable lockdep, sched->printk->console_trylock() will still have
pretty much the same chances to deadlock.

let me know if I'm missing something.

> One solution would be to keep lockdep as is in this patch. It means
> to hide existing risk until we have some reasonable printk_deferred()
> solution.

well, yes. this is still a possible way to go (until the deferred printk()).

> Another solution would be to keep this patch as is and implement
> WARN*_DEFERRED() variants that would either use
> printk_safe_enter()/exit() as the currently usable deferred and
> lockless solution.
> Or they could just disable lockdep and hide
> the report for now. These deferred variants should be
> used on all locations reported by lockdep where we want to accept the
> risk. We will at least know where the potential risk is and could find
> a proper solution later.

WARN*_DEFERRED() looks to me like an almost unmaintainable thing.
too much work; never-ending work.

> Note that I do not like hiding problems but they were hidden before this
> patchset as well. I am just looking for the best way forward.

sure.

	-ss