From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751322AbcA1XIZ (ORCPT ); Thu, 28 Jan 2016 18:08:25 -0500 Received: from mail-pa0-f43.google.com ([209.85.220.43]:36054 "EHLO mail-pa0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751068AbcA1XIW (ORCPT ); Thu, 28 Jan 2016 18:08:22 -0500 Subject: Re: [PATCH v4] lib/spinlock_debug.c: prevent a recursive cycle in the debug code To: Sergey Senozhatsky References: <1453896061-14102-1-git-send-email-byungchul.park@lge.com> <20160128014253.GC1538@X58A-UD3R> <20160128023750.GB1834@swordfish> <000301d15985$7f416690$7dc433b0$@lge.com> <20160128060530.GC1834@swordfish> <20160128081313.GB31266@X58A-UD3R> <20160128104137.GA610@swordfish> <20160128105342.GB610@swordfish> <20160128154257.GA564@swordfish> Cc: Byungchul Park , akpm@linux-foundation.org, mingo@kernel.org, linux-kernel@vger.kernel.org, akinobu.mita@gmail.com, jack@suse.cz, torvalds@linux-foundation.org, Sergey Senozhatsky From: Peter Hurley Message-ID: <56AA9F63.9070600@hurleysoftware.com> Date: Thu, 28 Jan 2016 15:08:19 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 MIME-Version: 1.0 In-Reply-To: <20160128154257.GA564@swordfish> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/28/2016 07:42 AM, Sergey Senozhatsky wrote: > On (01/28/16 19:53), Sergey Senozhatsky wrote: >>> ah... silly me... you mean the first CPU that triggers the spin_dump() will >> ^^^ this, of course, is true for >> console_sem->lock and logbuf_lock >> only. >> >>> deadlock itself, so the rest of CPUs will see endless recursive >>> spin_lock()->spin_dump()->spin_lock()->spin_dump() calls? > [..] >>> Can you please update your bug description in the commit message? >>> It's the deadlock that is causing the recursion on other CPUs in the >>> first place. > > no, don't update anything. I was completely wrong. it's not a deadlock > that is the root cause here. > > even if at some level of recursion (nested printk calls) > spin_dump()->__spin_lock_debug()->arch_spin_trylock() acquires the > lock, it returns back with the spin lock unlocked anyway. > > vprintk_emit() > console_trylock() > spin_lock() > spin_dump() > vprintk_emit() > console_trylock() > spin_lock() > spin_dump() > vprintk_emit() > console_trylock() > spin_lock() << OK, got the lock finally The problem is you have postulated a very shallow recursion. This looks much worse if this happens 1000 times, and probably won't recover to output anything. Additionally, what if the console_sem is simply corrupted? A livelock with no output ever is not very helpful. As I wrote earlier, I don't think this is the way to fix recursion problems with printk() [by eliding output]. Rather, a way to effectively determine a recursion is in progress, and _at a minimum_ guaranteeing that the recursive output will eventually be output should be the goal. Including dumb recursion like a console driver printing an error :/ Then, lockdep could remain enabled while calling console drivers. Regards, Peter Hurley > sem->count-- > spin_unlock() << unlock, return > arch_spin_lock() << got the lock, return > sem->count-- > spin_unlock() << unlock, return > arch_spin_lock() << got the lock, return > sem->count-- > spin_unlock() << unlock, return > > > ...um > > >> But I found there's a possiblity in the debug code *itself* to cause a >> lockup. > > please explain. > > -ss >