From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754266AbdHYBLU (ORCPT <rfc822;w@1wt.eu>);
        Thu, 24 Aug 2017 21:11:20 -0400
Received: from LGEAMRELO13.lge.com ([156.147.23.53]:47269 "EHLO
        lgeamrelo13.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753880AbdHYBLT (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 24 Aug 2017 21:11:19 -0400
X-Original-SENDERIP: 156.147.1.151
X-Original-MAILFROM: byungchul.park@lge.com
X-Original-SENDERIP: 10.177.222.33
X-Original-MAILFROM: byungchul.park@lge.com
Date: Fri, 25 Aug 2017 10:11:14 +0900
From: Byungchul Park <byungchul.park@lge.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, tj@kernel.org, boqun.feng@gmail.com,
        david@fromorbit.com, johannes@sipsolutions.net, oleg@redhat.com,
        linux-kernel@vger.kernel.org, kernel-team@lge.com
Subject: Re: [PATCH 4/4] lockdep: Fix workqueue crossrelease annotation
Message-ID: <20170825011114.GA3858@X58A-UD3R>
References: <20170823115843.662056844@infradead.org>
 <20170823121432.990701317@infradead.org>
 <20170824021840.GC6772@X58A-UD3R>
 <20170824140240.t4imrpvussebfimm@hirez.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170824140240.t4imrpvussebfimm@hirez.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 24, 2017 at 04:02:40PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 24, 2017 at 11:18:40AM +0900, Byungchul Park wrote:
> > On Wed, Aug 23, 2017 at 01:58:47PM +0200, Peter Zijlstra wrote:
> 
> > > Also, unconditionally switching to recursive-read here would fail to
> > > detect the actual deadlock on single-threaded workqueues, which do
> > 
> > Do you mean it's true even in case having fixed lockdep properly?
> > Could you explain why if so? IMHO, I don't think so.
> 
> I'm saying that if lockdep is fixed it should be:
> 
> 	if (wq->saved_max_active == 1 || wq->rescuer) {
> 		lock_map_acquire(wq->lockdep_map);
> 		lock_map_acquire(lockdep_map);
> 	} else {
> 		lock_map_acquire_read(wq->lockdep_map);
> 		lock_map_acquire_read(lockdep_map);
> 	}
> 
> or something like that, because for a single-threaded workqueue, the
> following _IS_ a deadlock:
> 
> 	work-n:
> 		wait_for_completion(C);
> 
> 	work-n+1:
> 		complete(C);
> 
> And that is the only case we now fail to catch.

Thank you for the explanation.
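
To check my understanding of the single-threaded case, here is a rough
userspace model of it. This is only a sketch, not kernel code: the names
sim_completion, sim_work and run_single_threaded() are made up for
illustration, and the "worker" is just a loop that runs queued items
strictly in order, which is what I assume a single-threaded workqueue
effectively does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion: only tracks whether it was signalled. */
struct sim_completion {
	bool done;
};

enum sim_op {
	SIM_WAIT,	/* models wait_for_completion(C) */
	SIM_COMPLETE,	/* models complete(C) */
};

/* Hypothetical work item: one operation on one completion. */
struct sim_work {
	enum sim_op op;
	struct sim_completion *c;
};

/*
 * Run the queued items strictly in order on a single worker.
 *
 * Returns the index of the item that would block forever, or -1 if
 * every item runs. A waiter blocks forever when its completion has not
 * been signalled yet, because the only worker is stuck inside that
 * waiter and can never reach the later item that would signal it.
 */
static int run_single_threaded(struct sim_work *q, int n)
{
	assert(q != NULL && n >= 0);

	for (int i = 0; i < n; i++) {
		if (q[i].op == SIM_COMPLETE)
			q[i].c->done = true;
		else if (!q[i].c->done)
			return i;
	}
	return -1;
}
```

With the queue { wait(C), complete(C) } this returns 0, i.e. work-n
never finishes, while the reversed order { complete(C), wait(C) }
returns -1. With more than one worker the second item could run
concurrently, which the model deliberately does not cover.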

> > > +void crossrelease_hist_start(enum xhlock_context_t c, bool force)
> > >  {
> > >  	struct task_struct *cur = current;
> > >  
> > > -	if (cur->xhlocks) {
> > > -		cur->xhlock_idx_hist[c] = cur->xhlock_idx;
> > > -		cur->hist_id_save[c] = cur->hist_id;
> > > +	if (!cur->xhlocks)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * We call this at an invariant point, no current state, no history.
> > > +	 */
> > 
> > This very work-around code _must_ be removed after fixing read-recursive
> > thing in lockdep. I think it would be better to add a tag(comment)
> > saying it.
> > 
> > > +	if (c == XHLOCK_PROC) {
> > > +		/* verified the former, ensure the latter */
> > > +		WARN_ON_ONCE(!force && cur->lockdep_depth);
> > > +		invalidate_xhlock(&xhlock(cur->xhlock_idx));
> > >  	}
> 
> No, this is not a work around, this is fundamentally so. It's not going
> away. The only thing that should go away is the .force argument.

I meant that this seems to stem from a misunderstanding of
crossrelease_hist_{start, end}().

A user of force == 1 should not exist, or at least does not have to
exist. I am sure you haven't read my replies. Please read at least the
following:

https://lkml.org/lkml/2017/8/24/126