From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753246AbdHXOCt (ORCPT ); Thu, 24 Aug 2017 10:02:49 -0400 Received: from bombadil.infradead.org ([65.50.211.133]:41697 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753075AbdHXOCr (ORCPT ); Thu, 24 Aug 2017 10:02:47 -0400 Date: Thu, 24 Aug 2017 16:02:40 +0200 From: Peter Zijlstra To: Byungchul Park Cc: mingo@kernel.org, tj@kernel.org, boqun.feng@gmail.com, david@fromorbit.com, johannes@sipsolutions.net, oleg@redhat.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH 4/4] lockdep: Fix workqueue crossrelease annotation Message-ID: <20170824140240.t4imrpvussebfimm@hirez.programming.kicks-ass.net> References: <20170823115843.662056844@infradead.org> <20170823121432.990701317@infradead.org> <20170824021840.GC6772@X58A-UD3R> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20170824021840.GC6772@X58A-UD3R> User-Agent: NeoMutt/20170609 (1.8.3) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Aug 24, 2017 at 11:18:40AM +0900, Byungchul Park wrote: > On Wed, Aug 23, 2017 at 01:58:47PM +0200, Peter Zijlstra wrote: > > Also, unconditinoally switching to recursive-read here would fail to > > detect the actual deadlock on single-threaded workqueues, which do > > Do you mean it's true even in case having fixed lockdep properly? > Could you explain why if so? IMHO, I don't think so. I'm saying that if lockdep is fixed it should be: if (wq->saved_max_active == 1 || wq->rescuer) { lock_map_acquire(wq->lockdep_map); lock_map_acquire(lockdep_map); } else { lock_map_acquire_read(wq->lockdep_map); lock_map_acquire_read(lockdep_map); } or something like that, because for a single-threaded workqueue, the following _IS_ a deadlock: work-n: wait_for_completion(C); work-n+1: complete(C); And that is the only case we now fail to catch. > > +void crossrelease_hist_start(enum xhlock_context_t c, bool force) > > { > > struct task_struct *cur = current; > > > > - if (cur->xhlocks) { > > - cur->xhlock_idx_hist[c] = cur->xhlock_idx; > > - cur->hist_id_save[c] = cur->hist_id; > > + if (!cur->xhlocks) > > + return; > > + > > + /* > > + * We call this at an invariant point, no current state, no history. > > + */ > > This very work-around code _must_ be removed after fixing read-recursive > thing in lockdep. I think it would be better to add a tag(comment) > saying it. > > > + if (c == XHLOCK_PROC) { > > + /* verified the former, ensure the latter */ > > + WARN_ON_ONCE(!force && cur->lockdep_depth); > > + invalidate_xhlock(&xhlock(cur->xhlock_idx)); > > } No, this is not a work around, this is fundamentally so. It's not going away. The only thing that should go away is the .force argument.