From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751172AbdHXCSo (ORCPT <rfc822;w@1wt.eu>);
        Wed, 23 Aug 2017 22:18:44 -0400
Received: from LGEAMRELO12.lge.com ([156.147.23.52]:51306 "EHLO
        lgeamrelo12.lge.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751061AbdHXCSn (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 23 Aug 2017 22:18:43 -0400
X-Original-SENDERIP: 156.147.1.126
X-Original-MAILFROM: byungchul.park@lge.com
X-Original-SENDERIP: 10.177.222.33
X-Original-MAILFROM: byungchul.park@lge.com
Date: Thu, 24 Aug 2017 11:18:40 +0900
From: Byungchul Park <byungchul.park@lge.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, tj@kernel.org, boqun.feng@gmail.com,
        david@fromorbit.com, johannes@sipsolutions.net, oleg@redhat.com,
        linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/4] lockdep: Fix workqueue crossrelease annotation
Message-ID: <20170824021840.GC6772@X58A-UD3R>
References: <20170823115843.662056844@infradead.org>
 <20170823121432.990701317@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170823121432.990701317@infradead.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 23, 2017 at 01:58:47PM +0200, Peter Zijlstra wrote:
> The new completion/crossrelease annotations interact unfavourably with
> the extant flush_work()/flush_workqueue() annotations.
> 
> The problem is that when a single work class does:
> 
>   wait_for_completion(&C)
> 
> and
> 
>   complete(&C)
> 
> in different executions, we'll build dependencies like:
> 
>   lock_map_acquire(W)
>   complete_acquire(C)
> 
> and
> 
>   lock_map_acquire(W)
>   complete_release(C)
> 
> which results in the dependency chain: W->C->W, which lockdep thinks
> spells deadlock, even though there is no deadlock potential since
> works are run concurrently.
> 
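
To make the chain above concrete for other readers, here is a toy
userspace model of it. This is only a sketch and not lockdep code: the
helpers add_dep(), reaches() and has_cycle_through() are made-up names,
and the graph has just the two classes W (the work 'lock') and C (the
completion). It assumes the waiting execution records W->C and the
completing execution records C->W, as described above.

```c
#include <stdbool.h>

/* Two lock classes only: the work 'lock' W and the completion C. */
enum node { W, C, NR_NODES };

/* dep[a][b] is true when the dependency a -> b has been recorded. */
static bool dep[NR_NODES][NR_NODES];

/* Hypothetical helper: record the dependency from -> to. */
static void add_dep(enum node from, enum node to)
{
	dep[from][to] = true;
}

/* Depth-first search: is 'target' reachable from 'cur' via recorded deps? */
static bool reaches(enum node cur, enum node target, bool *seen)
{
	if (seen[cur])
		return false;
	seen[cur] = true;

	for (int n = 0; n < NR_NODES; n++) {
		if (!dep[cur][n])
			continue;
		if (n == (int)target || reaches((enum node)n, target, seen))
			return true;
	}
	return false;
}

/* A class that can reach itself closes a cycle, i.e. a reported deadlock. */
static bool has_cycle_through(enum node start)
{
	bool seen[NR_NODES] = { false };

	return reaches(start, start, seen);
}

/* Execution 1: lock_map_acquire(W), then wait_for_completion(&C). */
static void record_waiting_work(void)
{
	add_dep(W, C);
}

/* Execution 2: lock_map_acquire(W), then complete(&C). */
static void record_completing_work(void)
{
	add_dep(C, W);
}
```

Calling record_waiting_work() alone leaves the graph acyclic. Once
record_completing_work() has also run, has_cycle_through(W) returns
true. The model has a single W class for both executions, so it cannot
tell that the two works may run concurrently, and that missing
distinction is where the false positive comes from.
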
> One possibility would be to change the work 'lock' to recursive-read,

I'm not sure this solves the issue completely, but either way the work
'lock' should become a recursive-read one once lockdep is fixed,
regardless of this issue.

> but that would mean hitting a lockdep limitation on recursive locks.

For now, a workaround might be needed.

> Also, unconditionally switching to recursive-read here would fail to
> detect the actual deadlock on single-threaded workqueues, which do

Do you mean that would still be true even after lockdep has been fixed
properly? If so, could you explain why? IMHO, it would not.

> @@ -4751,15 +4751,31 @@ static inline void invalidate_xhlock(str
>   * The same is true for system-calls, once a system call is completed (we've
>   * returned to userspace) the next system call does not depend on the lock
>   * history of the previous system call.
> + *
> + * They key property for independence, this invariant state, is that it must be
> + * a point where we hold no locks and have no history. Because if we were to
> + * hold locks, the restore at _end() would not necessarily recover it's history
> + * entry. Similarly, independence per-definition means it does not depend on
> + * prior state.
>   */
> -void crossrelease_hist_start(enum xhlock_context_t c)
> +void crossrelease_hist_start(enum xhlock_context_t c, bool force)
>  {
>  	struct task_struct *cur = current;
>  
> -	if (cur->xhlocks) {
> -		cur->xhlock_idx_hist[c] = cur->xhlock_idx;
> -		cur->hist_id_save[c] = cur->hist_id;
> +	if (!cur->xhlocks)
> +		return;
> +
> +	/*
> +	 * We call this at an invariant point, no current state, no history.
> +	 */

This workaround code _must_ be removed once recursive-read handling is
fixed in lockdep. I think it would be better to add a tag (comment)
saying so.

> +	if (c == XHLOCK_PROC) {
> +		/* verified the former, ensure the latter */
> +		WARN_ON_ONCE(!force && cur->lockdep_depth);
> +		invalidate_xhlock(&xhlock(cur->xhlock_idx));
>  	}
> +
> +	cur->xhlock_idx_hist[c] = cur->xhlock_idx;
> +	cur->hist_id_save[c]    = cur->hist_id;
>  }
>  
>  void crossrelease_hist_end(enum xhlock_context_t c)