* [PATCH 0/2] workqueue lockdep limitations/bugs
@ 2018-08-21 12:03 Johannes Berg
  2018-08-21 12:03 ` [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync() Johannes Berg
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 12:03 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan; +Cc: linux-kernel, linux-wireless


Hi Tejun, Lai, all,

While doing some unrelated testing earlier, I found a bug in workqueue
lockdep, and while looking into the code I found another bug, and another
limitation. I'll follow up with patches for the two bugs, and explain all
three issues here.


1)
The lockdep report is below - the problem is the following: cancel_work_sync()
calls flush_work(), which may or may not do something, depending on whether
or not the work is currently running. That's fine as far as the workqueue
semantics go, but it's not fine for lockdep, and it's also the reason for
issue 3 below.

The lockdep report I hit basically says:
 * you're flushing (but really I'm doing cancel_work_sync) the work
   item, so we take the workqueue lockdep map (it's an ordered
   workqueue)
 * the workqueue can have other work items on it, and those take locks
   that we already hold (at least indirectly via a dependency chain)
 * thus a locking violation occurred - the other work might be pending
   or running, so you have to wait for it and get a deadlock

Note how this isn't true - we really call cancel_work_sync(), so we don't
need to wait for any other work to run. If this one's running, it must be
the first one running on the workqueue (otherwise we'd just have cancelled
it and it wouldn't be running). Thus, the code is fine, but because
cancel_work_sync() just uses flush_work(), the annotation isn't quite
right. I've fixed that in the follow-up patch by making the annotation
depend on where we came from, but I'm not entirely sure that's the right
thing to do.
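
To make issue 1 concrete, here's a rough, self-contained sketch of the
pattern that triggers the false positive - it's the same scenario as in
patch 1 below, just with the declarations and lock/unlock pairs written
out (assume ordered_wq was created with alloc_ordered_workqueue()):

#include <linux/workqueue.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(mutex);
static struct workqueue_struct *ordered_wq; /* alloc_ordered_workqueue() */

static void work1_function(struct work_struct *w)
{
	mutex_lock(&mutex);
	/* ... */
	mutex_unlock(&mutex);
}

static void work2_function(struct work_struct *w)
{
	/* nothing */
}

static DECLARE_WORK(work1, work1_function);
static DECLARE_WORK(work2, work2_function);

static void other_function(void)
{
	queue_work(ordered_wq, &work1);
	queue_work(ordered_wq, &work2);
	mutex_lock(&mutex);
	/*
	 * Safe: work2 either hasn't started yet (then it's simply
	 * cancelled), or it's the item currently running, in which case
	 * nothing else - in particular not work1 - can be ahead of it on
	 * the ordered workqueue. Lockdep nevertheless ties this to the
	 * whole workqueue and reports a bogus deadlock against work1.
	 */
	cancel_work_sync(&work2);
	mutex_unlock(&mutex);
}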

2)
The second issue is that we never really use work->lockdep_map when we
flush the work, and thus if we have code like

work_function
{
	mutex_lock(&mutex);
}

other_function
{
	mutex_lock(&mutex);
	flush_work(&work);
}

we never actually get a lockdep report. I've fixed this by putting a lock
map acquire/release in the (hopefully) correct place.
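
Concretely, the annotation I mean is just an acquire/release pair on the
work's lockdep map, taken even if the work turns out not to be running,
roughly (this is what patch 2 below adds in start_flush_work()):

	lock_map_acquire(&work->lockdep_map);
	lock_map_release(&work->lockdep_map);

That way the example above records mutex -> work in other_function() and
work -> mutex via the work running, so lockdep sees the cycle and reports it.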

3)
The third problem is more complicated, but it means that we likely miss a
lot of potential locking problems with ordered workqueues, and I think we
should really try to fix this somehow.

Let's say we again have an ordered workqueue, and the following:

work1_function
{
	mutex_lock(&mutex);
}

work2_function
{
	// empty
}

other_function
{
	queue_work(ordered_wq, &work1);
	queue_work(ordered_wq, &work2);
	flush_workqueue(ordered_wq);
	mutex_lock(&mutex);
	flush_work(&work2);
}

As far as I can tell, this will not generate a lockdep report, because the
flush_work() will not find any pwq in start_flush_work(), and thus will do
"goto already_gone" without ever doing anything with lockdep. After my
second patch, at least it will still acquire and release the
work2.lockdep_map, but that doesn't create any dependency with the whole
workqueue.

I realize that this is difficult since the work struct isn't tied to the
workqueue while it's not running, and it could possibly be used with
different workqueues, but I think we need to solve this too, somehow.

Perhaps we can give each work struct *two* lockdep maps? One that includes
the workqueue dependency, and one that doesn't? Then we could create a
dependency when scheduling:

  schedule_work_on():
    lock_map_acquire(work->wq_lockdep_map);
    lock_map_acquire(wq->lockdep_map);

and then in flush we take both:
    lock_map_acquire(work->lockdep_map);
    lock_map_acquire(work->wq_lockdep_map);

(perhaps also when running the work).
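
Spelled out a bit more, with the hypothetical second map (the
work->wq_lockdep_map field above) and the releases added for symmetry -
this is just a sketch, not a worked-out patch:

    /* at queue time, e.g. in __queue_work(): */
    lock_map_acquire(&work->wq_lockdep_map);
    lock_map_acquire(&wq->lockdep_map);
    lock_map_release(&wq->lockdep_map);
    lock_map_release(&work->wq_lockdep_map);

    /* in flush_work(), even if the work isn't currently queued or running: */
    lock_map_acquire(&work->lockdep_map);
    lock_map_acquire(&work->wq_lockdep_map);
    lock_map_release(&work->wq_lockdep_map);
    lock_map_release(&work->lockdep_map);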

Then, the above scenario should create the following relevant lockdep chain:

mutex			--> work2->wq_lockdep_map
work2->wq_lockdep_map	--> wq->lockdep_map
wq->lockdep_map		--> work1->wq_lockdep_map
work1->wq_lockdep_map	--> mutex

I think?

johannes


----
lockdep report for problem 1

======================================================
WARNING: possible circular locking dependency detected
4.18.0-rc8+ #1573 Not tainted
------------------------------------------------------
wpa_supplicant/708 is trying to acquire lock:
00000000a1e36a54 ((wq_completion)"%s"wiphy_name(local->hw.wiphy)){+.+.}, at: flush_work+0x175/0x280

but task is already holding lock:
00000000056b83e2 (&local->sta_mtx){+.+.}, at: __sta_info_flush+0x94/0x190

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #3 (&local->sta_mtx){+.+.}:
       __mutex_lock+0x8e/0x9c0
       ieee80211_del_key+0x40/0x1e0
       nl80211_del_key+0x126/0x450
       genl_family_rcv_msg+0x2aa/0x370
       genl_rcv_msg+0x47/0x90
       netlink_rcv_skb+0x4a/0x110
       genl_rcv+0x24/0x40
       netlink_unicast+0x186/0x230
       netlink_sendmsg+0x3b7/0x3e0
       sock_sendmsg+0x16/0x20
       ___sys_sendmsg+0x2d5/0x2f0
       __sys_sendmsg+0x51/0x90
       do_syscall_64+0x59/0x390
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (&wdev->mtx){+.+.}:
       __mutex_lock+0x8e/0x9c0
       ieee80211_sta_rx_queued_mgmt+0x4b/0x3e0
       ieee80211_iface_work+0x21f/0x340
       process_one_work+0x28f/0x700
       worker_thread+0x39/0x3f0
       kthread+0x116/0x130
       ret_from_fork+0x3a/0x50

-> #1 ((work_completion)(&sdata->work)){+.+.}:
       process_one_work+0x262/0x700
       worker_thread+0x39/0x3f0
       kthread+0x116/0x130
       ret_from_fork+0x3a/0x50

-> #0 ((wq_completion)"%s"wiphy_name(local->hw.wiphy)){+.+.}:
       lock_acquire+0xa3/0x230
       flush_work+0x195/0x280
       __cancel_work_timer+0x137/0x1e0
       ieee80211_sta_tear_down_BA_sessions+0x8d/0x150
       __sta_info_destroy_part1+0x5d/0x9c0
       __sta_info_flush+0xd8/0x190
       ieee80211_set_disassoc+0xd2/0x7d0
       ieee80211_mgd_auth+0x200/0x340
       cfg80211_mlme_auth+0x152/0x440
       nl80211_authenticate+0x2fd/0x360
       genl_family_rcv_msg+0x2aa/0x370
       genl_rcv_msg+0x47/0x90
       netlink_rcv_skb+0x4a/0x110
       genl_rcv+0x24/0x40
       netlink_unicast+0x186/0x230
       netlink_sendmsg+0x3b7/0x3e0
       sock_sendmsg+0x16/0x20
       ___sys_sendmsg+0x2d5/0x2f0
       __sys_sendmsg+0x51/0x90
       do_syscall_64+0x59/0x390
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)"%s"wiphy_name(local->hw.wiphy) --> &wdev->mtx --> &local->sta_mtx

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&local->sta_mtx);
                               lock(&wdev->mtx);
                               lock(&local->sta_mtx);
  lock((wq_completion)"%s"wiphy_name(local->hw.wiphy));

 *** DEADLOCK ***

5 locks held by wpa_supplicant/708:
 #0: 00000000a50eb6ff (cb_lock){++++}, at: genl_rcv+0x15/0x40
 #1: 000000008954a22a (genl_mutex){+.+.}, at: genl_rcv_msg+0x7a/0x90
 #2: 000000007622d582 (rtnl_mutex){+.+.}, at: nl80211_pre_doit+0xe1/0x190
 #3: 000000004cd75553 (&wdev->mtx){+.+.}, at: nl80211_authenticate+0x2b1/0x360
 #4: 00000000056b83e2 (&local->sta_mtx){+.+.}, at: __sta_info_flush+0x94/0x190

stack backtrace:
CPU: 3 PID: 708 Comm: wpa_supplicant Not tainted 4.18.0-rc8+ #1573
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
Call Trace:
 dump_stack+0x85/0xcb
 print_circular_bug.isra.19+0x1d1/0x2c0
 check_prev_add.constprop.27+0x542/0x7c0
 __lock_acquire+0x1107/0x11e0
 lock_acquire+0xa3/0x230
 flush_work+0x195/0x280
 __cancel_work_timer+0x137/0x1e0
 ieee80211_sta_tear_down_BA_sessions+0x8d/0x150
 __sta_info_destroy_part1+0x5d/0x9c0
 __sta_info_flush+0xd8/0x190
 ieee80211_set_disassoc+0xd2/0x7d0
 ieee80211_mgd_auth+0x200/0x340
 cfg80211_mlme_auth+0x152/0x440
 nl80211_authenticate+0x2fd/0x360
 genl_family_rcv_msg+0x2aa/0x370
 genl_rcv_msg+0x47/0x90
 netlink_rcv_skb+0x4a/0x110
 genl_rcv+0x24/0x40
 netlink_unicast+0x186/0x230
 netlink_sendmsg+0x3b7/0x3e0
 sock_sendmsg+0x16/0x20
 ___sys_sendmsg+0x2d5/0x2f0
 __sys_sendmsg+0x51/0x90
 do_syscall_64+0x59/0x390
 entry_SYSCALL_64_after_hwframe+0x49/0xbe


* [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 12:03 [PATCH 0/2] workqueue lockdep limitations/bugs Johannes Berg
@ 2018-08-21 12:03 ` Johannes Berg
  2018-08-21 16:08   ` Tejun Heo
  2018-08-21 12:03 ` [PATCH 2/2] workqueue: create lockdep dependency in flush_work() Johannes Berg
  2018-08-21 16:00 ` [PATCH 0/2] workqueue lockdep limitations/bugs Tejun Heo
  2 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 12:03 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan; +Cc: linux-kernel, linux-wireless, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

In cancel_work_sync(), we can only have one of two cases, even
with an ordered workqueue:
 * the work isn't running, just cancelled before it started
 * the work is running, but then nothing else can be on the
   workqueue before it

Thus, we need to skip the lockdep workqueue dependency handling,
otherwise we get false positive reports from lockdep saying that
we have a potential deadlock when the workqueue also has other
work items with locking, e.g.

  work1_function() { mutex_lock(&mutex); ... }
  work2_function() { /* nothing */ }

  other_function() {
    queue_work(ordered_wq, &work1);
    queue_work(ordered_wq, &work2);
    mutex_lock(&mutex);
    cancel_work_sync(&work2);
  }

As described above, this isn't a problem, but lockdep will
currently flag it as if cancel_work_sync() was flush_work(),
which *is* a problem.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 kernel/workqueue.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 78b192071ef7..a6c2b823f348 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2843,7 +2843,8 @@ void drain_workqueue(struct workqueue_struct *wq)
 }
 EXPORT_SYMBOL_GPL(drain_workqueue);
 
-static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
+static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
+			     bool from_cancel)
 {
 	struct worker *worker = NULL;
 	struct worker_pool *pool;
@@ -2885,7 +2886,8 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
 	 * workqueues the deadlock happens when the rescuer stalls, blocking
 	 * forward progress.
 	 */
-	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
+	if (!from_cancel &&
+	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
 		lock_map_acquire(&pwq->wq->lockdep_map);
 		lock_map_release(&pwq->wq->lockdep_map);
 	}
@@ -2896,6 +2898,22 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
 	return false;
 }
 
+static bool __flush_work(struct work_struct *work, bool from_cancel)
+{
+	struct wq_barrier barr;
+
+	if (WARN_ON(!wq_online))
+		return false;
+
+	if (start_flush_work(work, &barr, from_cancel)) {
+		wait_for_completion(&barr.done);
+		destroy_work_on_stack(&barr.work);
+		return true;
+	} else {
+		return false;
+	}
+}
+
 /**
  * flush_work - wait for a work to finish executing the last queueing instance
  * @work: the work to flush
@@ -2909,18 +2927,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
  */
 bool flush_work(struct work_struct *work)
 {
-	struct wq_barrier barr;
-
-	if (WARN_ON(!wq_online))
-		return false;
-
-	if (start_flush_work(work, &barr)) {
-		wait_for_completion(&barr.done);
-		destroy_work_on_stack(&barr.work);
-		return true;
-	} else {
-		return false;
-	}
+	return __flush_work(work, false);
 }
 EXPORT_SYMBOL_GPL(flush_work);
 
@@ -2986,7 +2993,7 @@ static bool __cancel_work_timer(struct work_struct *work, bool is_dwork)
 	 * isn't executing.
 	 */
 	if (wq_online)
-		flush_work(work);
+		__flush_work(work, true);
 
 	clear_work_data(work);
 
-- 
2.14.4



* [PATCH 2/2] workqueue: create lockdep dependency in flush_work()
  2018-08-21 12:03 [PATCH 0/2] workqueue lockdep limitations/bugs Johannes Berg
  2018-08-21 12:03 ` [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync() Johannes Berg
@ 2018-08-21 12:03 ` Johannes Berg
  2018-08-21 16:09   ` Tejun Heo
  2018-08-21 16:00 ` [PATCH 0/2] workqueue lockdep limitations/bugs Tejun Heo
  2 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 12:03 UTC (permalink / raw)
  To: Tejun Heo, Lai Jiangshan; +Cc: linux-kernel, linux-wireless, Johannes Berg

From: Johannes Berg <johannes.berg@intel.com>

In flush_work(), we need to create a lockdep dependency so that
the following scenario is appropriately tagged as a problem:

  work_function()
  {
    mutex_lock(&mutex);
    ...
  }

  other_function()
  {
    mutex_lock(&mutex);
    flush_work(&work); // or cancel_work_sync(&work);
  }

This is a problem since the work might be running and be blocked
on trying to acquire the mutex.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 kernel/workqueue.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index a6c2b823f348..29d4d84ba512 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2852,6 +2852,11 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 
 	might_sleep();
 
+	if (!from_cancel) {
+		lock_map_acquire(&work->lockdep_map);
+		lock_map_release(&work->lockdep_map);
+	}
+
 	local_irq_disable();
 	pool = get_work_pool(work);
 	if (!pool) {
-- 
2.14.4



* Re: [PATCH 0/2] workqueue lockdep limitations/bugs
  2018-08-21 12:03 [PATCH 0/2] workqueue lockdep limitations/bugs Johannes Berg
  2018-08-21 12:03 ` [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync() Johannes Berg
  2018-08-21 12:03 ` [PATCH 2/2] workqueue: create lockdep dependency in flush_work() Johannes Berg
@ 2018-08-21 16:00 ` Tejun Heo
  2018-08-21 17:15   ` Johannes Berg
  2 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2018-08-21 16:00 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

Hello, Johannes.

On Tue, Aug 21, 2018 at 02:03:15PM +0200, Johannes Berg wrote:
> 3)
> The third problem is more complicated, but it means that we likely miss a
> lot of potential locking problems with ordered workqueues, and I think we
> should really try to fix this somehow.
> 
> Let's say we again have an ordered workqueue, and the following:
> 
> work1_function
> {
> 	mutex_lock(&mutex);
> }

Regular mutexes complain when the locker isn't the unlocker already.
Do we really care about this case?

Thanks.

-- 
tejun


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 12:03 ` [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync() Johannes Berg
@ 2018-08-21 16:08   ` Tejun Heo
  2018-08-21 17:18     ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2018-08-21 16:08 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Lai Jiangshan, linux-kernel, linux-wireless, Johannes Berg

On Tue, Aug 21, 2018 at 02:03:16PM +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> In cancel_work_sync(), we can only have one of two cases, even
> with an ordered workqueue:
>  * the work isn't running, just cancelled before it started
>  * the work is running, but then nothing else can be on the
>    workqueue before it
> 
> Thus, we need to skip the lockdep workqueue dependency handling,
> otherwise we get false positive reports from lockdep saying that
> we have a potential deadlock when the workqueue also has other
> work items with locking, e.g.
> 
>   work1_function() { mutex_lock(&mutex); ... }
>   work2_function() { /* nothing */ }
> 
>   other_function() {
>     queue_work(ordered_wq, &work1);
>     queue_work(ordered_wq, &work2);
>     mutex_lock(&mutex);
>     cancel_work_sync(&work2);
>   }
> 
> As described above, this isn't a problem, but lockdep will
> currently flag it as if cancel_work_sync() was flush_work(),
> which *is* a problem.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
>  kernel/workqueue.c | 37 ++++++++++++++++++++++---------------
>  1 file changed, 22 insertions(+), 15 deletions(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 78b192071ef7..a6c2b823f348 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -2843,7 +2843,8 @@ void drain_workqueue(struct workqueue_struct *wq)
>  }
>  EXPORT_SYMBOL_GPL(drain_workqueue);
>  
> -static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> +static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> +			     bool from_cancel)
>  {
>  	struct worker *worker = NULL;
>  	struct worker_pool *pool;
> @@ -2885,7 +2886,8 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
>  	 * workqueues the deadlock happens when the rescuer stalls, blocking
>  	 * forward progress.
>  	 */
> -	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
> +	if (!from_cancel &&
> +	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
>  		lock_map_acquire(&pwq->wq->lockdep_map);
>  		lock_map_release(&pwq->wq->lockdep_map);
>  	}

But this can lead to a deadlock.  I'd much rather err on the side of
discouraging complex lock dancing around ordered workqueues, no?

Thanks.

-- 
tejun


* Re: [PATCH 2/2] workqueue: create lockdep dependency in flush_work()
  2018-08-21 12:03 ` [PATCH 2/2] workqueue: create lockdep dependency in flush_work() Johannes Berg
@ 2018-08-21 16:09   ` Tejun Heo
  2018-08-21 17:19     ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2018-08-21 16:09 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Lai Jiangshan, linux-kernel, linux-wireless, Johannes Berg

On Tue, Aug 21, 2018 at 02:03:17PM +0200, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> In flush_work(), we need to create a lockdep dependency so that
> the following scenario is appropriately tagged as a problem:
> 
>   work_function()
>   {
>     mutex_lock(&mutex);
>     ...
>   }
> 
>   other_function()
>   {
>     mutex_lock(&mutex);
>     flush_work(&work); // or cancel_work_sync(&work);
>   }
> 
> This is a problem since the work might be running and be blocked
> on trying to acquire the mutex.
> 
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

This makes sense to me.  Did you notice any extra lockdep warnings
with this?

Thanks.

-- 
tejun


* Re: [PATCH 0/2] workqueue lockdep limitations/bugs
  2018-08-21 16:00 ` [PATCH 0/2] workqueue lockdep limitations/bugs Tejun Heo
@ 2018-08-21 17:15   ` Johannes Berg
  0 siblings, 0 replies; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 17:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

Hi Tejun,

> > Let's say we again have an ordered workqueue, and the following:
> > 
> > work1_function
> > {
> > 	mutex_lock(&mutex);
> > }
> 
> Regular mutexes complain when the locker isn't the unlocker already.
> Do we really care about this case?

Oh, sorry for the confusion. I was just eliding the unlock, since it's
not very interesting for this case. Really I should've typed up a
lock/unlock pair in all of the examples.

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 16:08   ` Tejun Heo
@ 2018-08-21 17:18     ` Johannes Berg
  2018-08-21 17:27       ` Tejun Heo
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 17:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

On Tue, 2018-08-21 at 09:08 -0700, Tejun Heo wrote:
> 
> > -static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> > +static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> > +			     bool from_cancel)
> >  {
> >  	struct worker *worker = NULL;
> >  	struct worker_pool *pool;
> > @@ -2885,7 +2886,8 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr)
> >  	 * workqueues the deadlock happens when the rescuer stalls, blocking
> >  	 * forward progress.
> >  	 */
> > -	if (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer) {
> > +	if (!from_cancel &&
> > +	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
> >  		lock_map_acquire(&pwq->wq->lockdep_map);
> >  		lock_map_release(&pwq->wq->lockdep_map);
> >  	}
> 
> But this can lead to a deadlock.  I'd much rather err on the side of
> discouraging complex lock dancing around ordered workqueues, no?

What can lead to a deadlock?

Writing out the example again, with the unlock now:

   work1_function() { mutex_lock(&mutex); mutex_unlock(&mutex); }
   work2_function() { /* nothing */ }
 
   other_function() {
     queue_work(ordered_wq, &work1);
     queue_work(ordered_wq, &work2);
     mutex_lock(&mutex);
     cancel_work_sync(&work2);
     mutex_unlock(&mutex);
   }

This shouldn't be able to lead to a deadlock like I had explained:

> In cancel_work_sync(), we can only have one of two cases, even
> with an ordered workqueue:
>  * the work isn't running, just cancelled before it started
>  * the work is running, but then nothing else can be on the
>    workqueue before it

johannes


* Re: [PATCH 2/2] workqueue: create lockdep dependency in flush_work()
  2018-08-21 16:09   ` Tejun Heo
@ 2018-08-21 17:19     ` Johannes Berg
  0 siblings, 0 replies; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 17:19 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

On Tue, 2018-08-21 at 09:09 -0700, Tejun Heo wrote:
> 
> This makes sense to me.  Did you notice any extra lockdep warnings
> with this?

I haven't used it beyond testing the trivial example I created on the
spot with a small mac80211 change, so can't really say either way.

johannes

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 17:18     ` Johannes Berg
@ 2018-08-21 17:27       ` Tejun Heo
  2018-08-21 17:30         ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2018-08-21 17:27 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

Hello, Johannes.

On Tue, Aug 21, 2018 at 07:18:14PM +0200, Johannes Berg wrote:
> > But this can lead to a deadlock.  I'd much rather err on the side of
> > discouraging complex lock dancing around ordered workqueues, no?
> 
> What can lead to a deadlock?

Oh not this particular case, but I was wondering whether we'd be
missing legitimate possible deadlock cases by skipping lockdep for all
cancel_work_sync()'s as they can actually flush.

Thanks.

-- 
tejun


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 17:27       ` Tejun Heo
@ 2018-08-21 17:30         ` Johannes Berg
  2018-08-21 17:55           ` Tejun Heo
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 17:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

Hi,

> On Tue, Aug 21, 2018 at 07:18:14PM +0200, Johannes Berg wrote:
> > > But this can lead to a deadlock.  I'd much rather err on the side of
> > > discouraging complex lock dancing around ordered workqueues, no?
> > 
> > What can lead to a deadlock?
> 
> Oh not this particular case, but I was wondering whether we'd be
> missing legitimate possible deadlock cases by skipping lockdep for all
> cancel_work_sync()'s as they can actually flush.

I don't see how? This is only relevant in ordered/single-threaded WQs,
but even there it doesn't matter, as explained?

I'm actually seeing a false positive report from lockdep, because it
*is* flushing, i.e. I'm running into the case of the work actually
running, i.e. the "_sync" part of "cancel_work_sync()" is kicking in,
but in that case a single-threaded WQ can't have anything executing
*before* it, so we don't need to generate a lockdep dependency - and in
fact don't *want* to create one to avoid the false positive.

I'm not really sure what you think we might be missing? Am I missing
some case where cancel_work_sync() can possibly deadlock? Apart from the
issue I addressed in the second patch, obviously.

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 17:30         ` Johannes Berg
@ 2018-08-21 17:55           ` Tejun Heo
  2018-08-21 19:20             ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Tejun Heo @ 2018-08-21 17:55 UTC (permalink / raw)
  To: Johannes Berg; +Cc: Lai Jiangshan, linux-kernel, linux-wireless

Hello,

On Tue, Aug 21, 2018 at 07:30:21PM +0200, Johannes Berg wrote:
> I don't see how? This is only relevant in ordered/single-threaded WQs,
> but even there it doesn't matter, as explained?
> 
> I'm actually seeing a false positive report from lockdep, because it
> *is* flushing, i.e. I'm running into the case of the work actually
> running, i.e. the "_sync" part of "cancel_work_sync()" is kicking in,
> but in that case a single-threaded WQ can't have anything executing
> *before* it, so we don't need to generate a lockdep dependency - and in
> fact don't *want* to create one to avoid the false positive.
> 
> I'm not really sure what you think we might be missing? Am I missing
> some case where cancel_work_sync() can possibly deadlock? Apart from the
> issue I addressed in the second patch, obviously.

Ah, that was me being slow.  I thought you were skipping the work's
lockdep_map.  I can almost swear we had that before (the part you're
adding on the second patch).  Right, fd1a5b04dfb8 ("workqueue: Remove
now redundant lock acquisitions wrt. workqueue flushes") removed it
because it gets propagated through wait_for_completion().  Did we miss
some cases with that change?

Thanks.

-- 
tejun


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 17:55           ` Tejun Heo
@ 2018-08-21 19:20             ` Johannes Berg
  2018-08-22  2:45               ` Byungchul Park
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-21 19:20 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Lai Jiangshan, linux-kernel, linux-wireless, Byungchul Park

On Tue, 2018-08-21 at 10:55 -0700, Tejun Heo wrote:

> > I'm not really sure what you think we might be missing? Am I missing
> > some case where cancel_work_sync() can possibly deadlock? Apart from the
> > issue I addressed in the second patch, obviously.
> 
> Ah, that was me being slow.  I thought you were skipping the work's
> lockdep_map.  I can almost swear we had that before (the part you're
> adding on the second patch).  Right, fd1a5b04dfb8 ("workqueue: Remove
> now redundant lock acquisitions wrt. workqueue flushes") removed it
> because it gets propagated through wait_for_completion().  Did we miss
> some cases with that change?

Hmm.

It doesn't seem to be working.

No, ok, actually it probably *does*, but the point is similar to my
issue # 3 before - we don't do any of this unless the work is actually
running, but we really want the lockdep annotation *regardless* of that,
so that we catch the error unconditionally.

So perhaps that commit just needs to be reverted entirely - I'd only
looked at a small subset of it, but the flush_workqueue() case has the
same problem - we only get to the completion when there's something to
flush, not when the workqueue happens to actually be empty. But again,
for lockdep we want to catch *potential* problems, not only *actual*
ones.

The remaining part of the patch I'm not sure I fully understand (removal
of lockdep_init_map_crosslock()), but I suppose if we revert the other
bits we need to revert this as well.

So please drop this patch, but revert Byungchul Park's commit
fd1a5b04dfb8 again, I don't think the lockdep annotations there are
really redundant as I just explained.

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-21 19:20             ` Johannes Berg
@ 2018-08-22  2:45               ` Byungchul Park
  2018-08-22  4:02                 ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Byungchul Park @ 2018-08-22  2:45 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Tue, Aug 21, 2018 at 09:20:41PM +0200, Johannes Berg wrote:
> On Tue, 2018-08-21 at 10:55 -0700, Tejun Heo wrote:
> 
> > > I'm not really sure what you think we might be missing? Am I missing
> > > some case where cancel_work_sync() can possibly deadlock? Apart from the
> > > issue I addressed in the second patch, obviously.
> > 
> > Ah, that was me being slow.  I thought you were skipping the work's
> > lockdep_map.  I can almost swear we had that before (the part you're
> > adding on the second patch).  Right, fd1a5b04dfb8 ("workqueue: Remove
> > now redundant lock acquisitions wrt. workqueue flushes") removed it
> > because it gets propagated through wait_for_completion().  Did we miss
> > some cases with that change?
> 
> Hmm.
> 
> It doesn't seem to be working.
> 
> No, ok, actually it probably *does*, but the point is similar to my
> issue # 3 before - we don't do any of this unless the work is actually
> running, but we really want the lockdep annotation *regardless* of that,
> so that we catch the error unconditionally.
> 
> So perhaps that commit just needs to be reverted entirely - I'd only
> looked at a small subset of it, but the flush_workqueue() case has the
> same problem - we only get to the completion when there's something to
> flush, not when the workqueue happens to actually be empty. But again,
> for lockdep we want to catch *potential* problems, not only *actual*
> ones.
> 
> The remaining part of the patch I'm not sure I fully understand (removal
> of lockdep_init_map_crosslock()), but I suppose if we revert the other
> bits we need to revert this as well.
> 
> So please drop this patch, but revert Byungchul Park's commit
> fd1a5b04dfb8 again, I don't think the lockdep annotations there are
> really redundant as I just explained.

That should've been adjusted as well when Ingo reverted Cross-release.
It would be much easier to add each pair, acquire/release, before
wait_for_completion() in both flush_workqueue() and flush_work() than
reverting the whole commit.

What's lacking is only lockdep annotations for wait_for_completion().
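
To be concrete, the pairs would go roughly like this (just a sketch of
the placement, not the exact code):

	/* in flush_workqueue(), before it waits for the flush to finish: */
	lock_map_acquire(&wq->lockdep_map);
	lock_map_release(&wq->lockdep_map);

	/* in flush_work(), before wait_for_completion(&barr.done): */
	lock_map_acquire(&work->lockdep_map);
	lock_map_release(&work->lockdep_map);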

Byungchul

> 
> johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  2:45               ` Byungchul Park
@ 2018-08-22  4:02                 ` Johannes Berg
  2018-08-22  5:47                   ` Byungchul Park
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-22  4:02 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:

> That should've been adjusted as well when Ingo reverted Cross-release.

I can't really say.

> It would be much easier to add each pair, acquire/release, before
> wait_for_completion() in both flush_workqueue() and flush_work() than
> reverting the whole commit.

The commit doesn't do much more than this though.

> What's lacking is only lockdep annotations for wait_for_completion().

No, I disagree. Like I said before, we need the lockdep annotations on
the WQ even when we don't actually create/use the completion, so we can
catch issues even when they don't actually happen.

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  4:02                 ` Johannes Berg
@ 2018-08-22  5:47                   ` Byungchul Park
  2018-08-22  7:07                     ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Byungchul Park @ 2018-08-22  5:47 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> 
> > That should've been adjusted as well when Ingo reverted Cross-release.
> 
> I can't really say.

What do you mean?

> > It would be much easier to add each pair, acquire/release, before
> > wait_for_completion() in both flush_workqueue() and flush_work() than
> > reverting the whole commit.
> 
> The commit doesn't do much more than this though.

That also gave the lockdep_map for wq/work better names.

> > What's lacking is only lockdep annotations for wait_for_completion().
> 
> No, I disagree. Like I said before, we need the lockdep annotations on

You seem to be confused. I was talking about wait_for_completion() in
both flush_workqueue() and flush_work(). Without the
wait_for_completion()s, nothing matters wrt what you are concerned about.

> the WQ even when we don't actually create/use the completion, so we can
> catch issues even when they don't actually happen.

This is obviously true.

Byungchul

> 
> johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  5:47                   ` Byungchul Park
@ 2018-08-22  7:07                     ` Johannes Berg
  2018-08-22  7:50                       ` Byungchul Park
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-22  7:07 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, 2018-08-22 at 14:47 +0900, Byungchul Park wrote:
> On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> > On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> > 
> > > That should've been adjusted as well when Ingo reverted Cross-release.
> > 
> > I can't really say.
> 
> What do you mean?

I haven't followed any of this, so I just don't know.

> > > It would be much easier to add each pair, acquire/release, before
> > > wait_for_completion() in both flush_workqueue() and flush_work() than
> > > reverting the whole commit.
> > 
> > The commit doesn't do much more than this though.
> 
> That also gave the lockdep_map for wq/work better names.

What do you mean?

> > > What's lacking is only lockdep annotations for wait_for_completion().
> > 
> > No, I disagree. Like I said before, we need the lockdep annotations on
> 
> You seem to be confused. I was talking about wait_for_completion() in
> both flush_workqueue() and flush_work(). Without the
> wait_for_completion()s, nothing matters wrt what you are concerned about.

Yes and no.

You're basically saying if we don't get to do a wait_for_completion(),
then we don't need any lockdep annotation. I'm saying this isn't true.

Consider the following case:

work_function()
{
	mutex_lock(&mutex);
	mutex_unlock(&mutex);
}

other_function()
{
	queue_work(&my_wq, &work);

	if (common_case) {
		schedule_and_wait_for_something_that_takes_a_long_time()
	}

	mutex_lock(&mutex);
	flush_workqueue(&my_wq);
	mutex_unlock(&mutex);
}


Clearly this code is broken, right?

However, you'll almost never get lockdep to indicate that, because of
the "if (common_case)".

My argument basically is that the lockdep annotations in the workqueue
code should be entirely independent of the actual need to call
wait_for_completion().
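
In other words, roughly something like this (just a sketch), so the
dependency gets recorded before any of the early-out paths:

void flush_workqueue(struct workqueue_struct *wq)
{
	/*
	 * Record the dependency unconditionally, even if it turns out
	 * below that there's nothing to flush and we never reach
	 * wait_for_completion().
	 */
	lock_map_acquire(&wq->lockdep_map);
	lock_map_release(&wq->lockdep_map);

	/* ... existing flush logic ... */
}

and the same thing with work->lockdep_map at the top of flush_work().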

Therefore, the commit should be reverted regardless of any cross-release 
work (that I neither know and thus don't understand right now), since it
makes workqueue code rely on lockdep for the completion, whereas we
really want to have annotations here even when we didn't actually need
to wait_for_completion().

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  7:07                     ` Johannes Berg
@ 2018-08-22  7:50                       ` Byungchul Park
  2018-08-22  8:02                         ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Byungchul Park @ 2018-08-22  7:50 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, Aug 22, 2018 at 09:07:23AM +0200, Johannes Berg wrote:
> On Wed, 2018-08-22 at 14:47 +0900, Byungchul Park wrote:
> > On Wed, Aug 22, 2018 at 06:02:23AM +0200, Johannes Berg wrote:
> > > On Wed, 2018-08-22 at 11:45 +0900, Byungchul Park wrote:
> > > 
> > > > That should've been adjusted as well when Ingo reverted Cross-release.
> > > 
> > > I can't really say.
> > 
> > What do you mean?
> 
> I haven't followed any of this, so I just don't know.
> 
> > > > It would be much easier to add each pair, acquire/release, before
> > > > wait_for_completion() in both flush_workqueue() and flush_work() than
> > > > reverting the whole commit.
> > > 
> > > The commit doesn't do much more than this though.
> > 
> > That also gave the lockdep_map for wq/work better names.
> 
> What do you mean?

Ah.. it's not an important thing. I just mentioned that I changed the lock
names a bit when initializing the lockdep_map instances, which was suggested
by Ingo. But no problem even if you revert the whole thing. I just wanted to
let you know. ;)

> > > > What's lacking is only lockdep annotations for wait_for_completion().
> > > 
> > > No, I disagree. Like I said before, we need the lockdep annotations on
> > 
> > You seem to be confused. I was talking about wait_for_completion() in
> > both flush_workqueue() and flush_work(). Without the
> > wait_for_completion()s, nothing matters wrt what you are concerned about.
> 
> Yes and no.
> 
> You're basically saying if we don't get to do a wait_for_completion(),
> then we don't need any lockdep annotation. I'm saying this isn't true.

Strictly no. But I'm just talking about the case in wq flush code.

> Consider the following case:
> 
> work_function()
> {
> 	mutex_lock(&mutex);
> 	mutex_unlock(&mutex);
> }
> 
> other_function()
> {
> 	queue_work(&my_wq, &work);
> 
> 	if (common_case) {
> 		schedule_and_wait_for_something_that_takes_a_long_time()
> 	}
> 
> 	mutex_lock(&mutex);
> 	flush_workqueue(&my_wq);
> 	mutex_unlock(&mutex);
> }
> 
> 
> Clearly this code is broken, right?
> 
> However, you'll almost never get lockdep to indicate that, because of
> the "if (common_case)".

Sorry, I don't follow you. Why is that a problem with the example? Please
give a deadlock example.

> My argument basically is that the lockdep annotations in the workqueue
> code should be entirely independent of the actual need to call
> wait_for_completion().

No. Lockdep annotations always have to do with either a wait_for_something or
a self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..

> Therefore, the commit should be reverted regardless of any cross-release

No. That is necessary only when the wait_for_completion() cannot be
tracked in checking dependencies automatically by cross-release.

It might be the key to understanding you: could you explain more why you
think lockdep annotations are independent of the actual need to call
wait_for_completion() (or wait_for_something_else), hopefully with a
deadlock example?

> work (that I neither know and thus don't understand right now), since it
> makes workqueue code rely on lockdep for the completion, whereas we

Using wait_for_completion(), right?

> really want to have annotations here even when we didn't actually need
> to wait_for_completion().

Please give an example of a deadlock even w/o wait_for_completion().

> 
> johannes

Byungchul



* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  7:50                       ` Byungchul Park
@ 2018-08-22  8:02                         ` Johannes Berg
  2018-08-22  9:15                           ` Byungchul Park
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-22  8:02 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, 2018-08-22 at 16:50 +0900, Byungchul Park wrote:

> > You're basically saying if we don't get to do a wait_for_completion(),
> > then we don't need any lockdep annotation. I'm saying this isn't true.
> 
> Strictly no. But I'm just talking about the case in wq flush code.

Sure, I meant it's not true in the wq flush code, not talking about
anything else.

> > Consider the following case:
> > 
> > work_function()
> > {
> > 	mutex_lock(&mutex);
> > 	mutex_unlock(&mutex);
> > }
> > 
> > other_function()
> > {
> > 	queue_work(&my_wq, &work);
> > 
> > 	if (common_case) {
> > 		schedule_and_wait_for_something_that_takes_a_long_time()
> > 	}
> > 
> > 	mutex_lock(&mutex);
> > 	flush_workqueue(&my_wq);
> > 	mutex_unlock(&mutex);
> > }
> > 
> > 
> > Clearly this code is broken, right?
> > 
> > However, you'll almost never get lockdep to indicate that, because of
> > the "if (common_case)".
> 
> > Sorry, I don't follow you. Why is that a problem with the example? Please
> > give a deadlock example.

sure, I thought that was easy:

thread 1			thread 2 (wq thread)

common_case = false;
queue_work(&my_wq, &work);
mutex_lock(&mutex);

flush_workqueue(&my_wq);
				work_function()
				-> mutex_lock(&mutex);
				-> schedule(), wait for mutex
wait_for_completion()

-> deadlock - we can't make any forward progress here.


> > My argument basically is that the lockdep annotations in the workqueue
> > code should be entirely independent of the actual need to call
> > wait_for_completion().
> 
> No. Lockdep annotations always have to do with either a wait_for_something or
> a self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..

Yes, but I'm saying that we should catch *potential* deadlocks *before*
they happen.

See the example above. Clearly, you're actually deadlocking, and
obviously (assuming all the wait_for_completion() things work right)
lockdep will show why we just deadlocked.

BUT.

This is useless. We want/need lockdep to show *potential* deadlocks,
even when they didn't happen. Consider the other case in the above
scenario:

thread 1			thread 2 (wq thread)

common_case = true;
queue_work(&my_wq, &work);

schedule_and_wait_...();	work_function()
				-> mutex_lock(&mutex);
				-> mutex_unlock()
				done


mutex_lock(&mutex);
flush_workqueue(&my_wq);
-> nothing to do, will NOT
   call wait_for_completion();

-> no deadlock

Here we don't have a deadlock, but without the revert we will also not
get a lockdep report. We should though, because we're doing something
that's quite clearly dangerous - we simply don't know if the work
function will complete before we get to flush_workqueue(). Maybe the
work function has an uncommon case itself that takes forever, etc.

> > Therefore, the commit should be reverted regardless of any cross-release
> 
> No. That is necessary only when the wait_for_completion() cannot be
> tracked in checking dependencies automatically by cross-release.

No. See above. We want the annotation regardless of invoking
wait_for_completion().

> It might be the key to understanding you: could you explain more why you
> think lockdep annotations are independent of the actual need to call
> wait_for_completion() (or wait_for_something_else), hopefully with a
> deadlock example?

See above.

You're getting too hung up about a deadlock example. We don't want to
have lockdep only catch *actual* deadlocks. The code I wrote clearly has
a potential deadlock (sequence given above), but in most cases the code
above will *not* deadlock. This is the interesting part we want lockdep
to catch.

> > work (that I neither know and thus don't understand right now), since it
> > makes workqueue code rely on lockdep for the completion, whereas we
> 
> Using wait_for_completion(), right?

Yes.

> > really want to have annotations here even when we didn't actually need
> > to wait_for_completion().
> 
> Please give an example of a deadlock even w/o wait_for_completion().

No, here's where you get confused. Clearly, there is no lockdep if we
don't do wait_for_completion(). But if you have the code above, lockdep
should still warn about the potential deadlock that happens when you
*do* get to wait_for_completion(). Lockdep shouldn't be restricted to
warning about a deadlock that *actually* happened.

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  8:02                         ` Johannes Berg
@ 2018-08-22  9:15                           ` Byungchul Park
  2018-08-22  9:42                             ` Johannes Berg
  0 siblings, 1 reply; 22+ messages in thread
From: Byungchul Park @ 2018-08-22  9:15 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, Aug 22, 2018 at 10:02:20AM +0200, Johannes Berg wrote:
> > Sorry, I don't follow you. Why is that a problem with the example? Please
> > give a deadlock example.
> 
> sure, I thought that was easy:
> 
> thread 1			thread 2 (wq thread)
> 
> common_case = false;
> queue_work(&my_wq, &work);
> mutex_lock(&mutex);
> 
> flush_workqueue(&my_wq);
> 				work_function()
> 				-> mutex_lock(&mutex);
> 				-> schedule(), wait for mutex
> wait_for_completion()
> 
> -> deadlock - we can't make any forward progress here.

I see. Now, I understand what you are talking about.

   if (common_case)
	schedule_and_wait_for_something_that_takes_a_long_time()

should be:

   if (common_case)
	schedule_and_wait_long_time_so_that_the_work_to_finish()

> > > My argument basically is that the lockdep annotations in the workqueue
> > > code should be entirely independent of the actual need to call
> > > wait_for_completion().
> > 
> > No. Lockdep annotations always have to do with either a wait_for_something or
> > a self event loop within a single context, e.g. fs -> memory reclaim -> fs -> ..
> 
> Yes, but I'm saying that we should catch *potential* deadlocks *before*
> they happen.

Absolutely true about what lockdep does.

> See the example above. Clearly, you're actually deadlocking, and
> obviously (assuming all the wait_for_completion() things work right)
> lockdep will show why we just deadlocked.
> 
> BUT.
> 
> This is useless. We want/need lockdep to show *potential* deadlocks,
> even when they didn't happen. Consider the other case in the above
> scenario:
> 
> thread 1			thread 2 (wq thread)
> 
> common_case = true;
> queue_work(&my_wq, &work);
> 
> schedule_and_wait_...();	work_function()
> 				-> mutex_lock(&mutex);
> 				-> mutex_unlock()
> 				done
> 
> 
> mutex_lock(&mutex);
> flush_workqueue(&my_wq);
> -> nothing to do, will NOT
>    call wait_for_completion();
> 
> -> no deadlock

Sure. It's not a deadlock because wait_for_completion() is not involved
in this path - where wait_for_completion() is necessary to deadlock.

Ok. I didn't know what you are talking about but now I got it.

You are talking about the fact that nobody knows whether common_case is true
or not, so we must aggressively consider the case where wait_for_completion()
hasn't been called so far but may be called in the future.

I think it's a problem of how aggressively we need to check dependencies.
If we choose the aggressive option, then we get more aggressive reporting
but cannot avoid more aggressive false positives either.
I don't want to strongly argue that because it's a problem of policy.

Just, I would only consider waits that actually happened anyway. Of
course, we definitely have to consider the waits even if no actual
deadlock happens, since detecting potential problems is more
important than catching actual ones, as you said.

So no big objection to check dependencies aggressively.

> Here we don't have a deadlock, but without the revert we will also not

You misunderstand me. The commit should be reverted or acquire/release
pairs should be added in both flush functions.

> get a lockdep report. We should though, because we're doing something
> that's quite clearly dangerous - we simply don't know if the work
> function will complete before we get to flush_workqueue(). Maybe the
> work function has an uncommon case itself that takes forever, etc.
> 
> > > Therefore, the commit should be reverted regardless of any cross-release
> > 
> > No. That is necessary only when the wait_for_completion() cannot be
> > tracked in checking dependencies automatically by cross-release.
> 
> No. See above. We want the annotation regardless of invoking
> wait_for_completion().

Anyway the annotation should be placed in the path where
wait_for_completion() might be called. But whether doing it
regardless of invoking wait_for_completion() or not is a problem of
policy so I don't care. No big objection whichever you do.

> > It might be the key to understanding you: could you explain more why you
> > think lockdep annotations are independent of the actual need to call
> > wait_for_completion() (or wait_for_something_else), hopefully with a
> > deadlock example?
> 
> See above.
> 
> You're getting too hung up about a deadlock example. We don't want to

No no. Your example was helpful to understand what you are talking about.
I asked you for a potential deadlock example.

> have lockdep only catch *actual* deadlocks. The code I wrote clearly has
> a potential deadlock (sequence given above), but in most cases the code
> above will *not* deadlock. This is the interesting part we want lockdep
> to catch.

Absolutely true. You can find my opinion about that in
Documentation/locking/crossrelease.txt which has been removed because
crossrelease is strong at detecting *potential* deadlock problems.

> > > work (that I neither know and thus don't understand right now), since it
> > > makes workqueue code rely on lockdep for the completion, whereas we
> > 
> > Using wait_for_completion(), right?
> 
> Yes.
> 
> > > really want to have annotations here even when we didn't actually need
> > > to wait_for_completion().
> > 
> > Please give an example of a deadlock even w/o wait_for_completion().
> 
> No, here's where you get confused. Clearly, there is no lockdep if we
> don't do wait_for_completion(). But if you have the code above, lockdep
> should still warn about the potential deadlock that happens when you
> *do* get to wait_for_completion(). Lockdep shouldn't be restricted to
> warning about a deadlock that *actually* happened.

I didn't know if you were talking about this in the previous talk.

Byungchul



* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  9:15                           ` Byungchul Park
@ 2018-08-22  9:42                             ` Johannes Berg
  2018-08-22 12:47                               ` Byungchul Park
  0 siblings, 1 reply; 22+ messages in thread
From: Johannes Berg @ 2018-08-22  9:42 UTC (permalink / raw)
  To: Byungchul Park
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team

On Wed, 2018-08-22 at 18:15 +0900, Byungchul Park wrote:
> 
> > thread 1			thread 2 (wq thread)
> > 
> > common_case = false;
> > queue_work(&my_wq, &work);
> > mutex_lock(&mutex);
> > 
> > flush_workqueue(&my_wq);
> > 				work_function()
> > 				-> mutex_lock(&mutex);
> > 				-> schedule(), wait for mutex
> > wait_for_completion()
> > 
> > -> deadlock - we can't make any forward progress here.
> 
> I see. Now, I understand what you are talking about.
> 
>    if (common_case)
> 	schedule_and_wait_for_something_that_takes_a_long_time()
> 
> should be:
> 
>    if (common_case)
> 	schedule_and_wait_long_time_so_that_the_work_to_finish()

Fair point.

> Ok. I didn't know what you are talking about but now I got it.

great.

> You are talking about the fact that nobody knows whether common_case is true
> or not, so we must aggressively consider the case where wait_for_completion()
> hasn't been called so far but may be called in the future.

Yes.

> I think it's a problem of how aggressively we need to check dependencies.
> If we choose the aggressive option, then we get more aggressive reporting
> but cannot avoid more aggressive false positives either.
> I don't want to strongly argue that because it's a problem of policy.

I don't think you could consider a report from "aggressive reporting" to
 be a false positive. It's clearly possible that this happens, and once
you go to a workqueue you basically don't really know your sequencing
and timing any more.

> Just, I would only consider waits that actually happened anyway. Of
> course, we definitely have to consider the waits even if no actual
> deadlock happens, since detecting potential problems is more
> important than catching actual ones, as you said.
> 
> So no big objection to check dependencies aggressively.
> 
> > Here we don't have a deadlock, but without the revert we will also not
> 
> You misunderstand me. The commit should be reverted or acquire/release
> pairs should be added in both flush functions.

Ok, I thought you were arguing we shouldn't revert it :)

I don't know whether to revert or just add the pairs in the flush
functions, because I can't say I understand what the third part of the
patch does.

> Anyway the annotation should be placed in the path where
> wait_for_completion() might be called.

Yes, it should be called regardless of whether we actually wait or not,
i.e. before most of the checking in the functions.

My issue #3 that I outlined is related - we'd like to have lockdep
understand that if this work was on the WQ it might deadlock, but we
don't have a way to get the WQ unless the work is scheduled - and in the
case that it is scheduled we might already be hitting the deadlock
(depending on the order of execution though I guess).

> Absolutely true. You can find my opinion about that in
> Documentation/locking/crossrelease.txt which has been removed because
> crossrelease is strong at detecting *potential* deadlock problems.

Ok, you know what, I'm going to read this now ... hang on........

I see. You were trying to solve the problem of completions/context
transfers in lockdep.

Given the revert of crossrelease though, we can't actually revert your
patch anyway, and looking at it again I see what you mean by the "name"
now ...

So yeah, we should only re-add the two acquire/release pairs. I guess
I'll make a patch for that, replacing my patch 2.

I'm intrigued by the crossrelease - but that's a separate topic.

johannes


* Re: [PATCH 1/2] workqueue: skip lockdep wq dependency in cancel_work_sync()
  2018-08-22  9:42                             ` Johannes Berg
@ 2018-08-22 12:47                               ` Byungchul Park
  0 siblings, 0 replies; 22+ messages in thread
From: Byungchul Park @ 2018-08-22 12:47 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Tejun Heo, Lai Jiangshan, linux-kernel, linux-wireless, kernel-team



On 08/22/2018 06:42 PM, Johannes Berg wrote:
> On Wed, 2018-08-22 at 18:15 +0900, Byungchul Park wrote:
>>> thread 1			thread 2 (wq thread)
>>>
>>> common_case = false;
>>> queue_work(&my_wq, &work);
>>> mutex_lock(&mutex);
>>>
>>> flush_workqueue(&my_wq);
>>> 				work_function()
>>> 				-> mutex_lock(&mutex);
>>> 				-> schedule(), wait for mutex
>>> wait_for_completion()
>>>
>>> -> deadlock - we can't make any forward progress here.
>> I see. Now, I understand what you are talking about.
>>
>>     if (common_case)
>> 	schedule_and_wait_for_something_that_takes_a_long_time()
>>
>> should be:
>>
>>     if (common_case)
>> 	schedule_and_wait_long_time_so_that_the_work_to_finish()
> Fair point.
>
>> Ok. I didn't know what you are talking about but now I got it.
> great.
>
>> You are talking about the fact that nobody knows whether common_case is true
>> or not, so we must aggressively consider the case where wait_for_completion()
>> hasn't been called so far but may be called in the future.
> Yes.
>
>> I think it's a problem of how aggressively we need to check dependencies.
>> If we choose the aggressive option, then we get more aggressive reporting
>> but cannot avoid more aggressive false positives either.
>> I don't want to strongly argue that because it's a problem of policy.
> I don't think you could consider a report from "aggressive reporting" to
>   be a false positive. It's clearly possible that this happens, and once
> you go to a workqueue you basically don't really know your sequencing
> and timing any more.
>
>> Just, I would only consider waits that actually happened anyway. Of
>> course, we definitely have to consider the waits even if no actual
>> deadlock happens, since detecting potential problems is more
>> important than catching actual ones, as you said.
>>
>> So no big objection to check dependencies aggressively.
>>
>>> Here we don't have a deadlock, but without the revert we will also not
>> You misunderstand me. The commit should be reverted or acquire/release
>> pairs should be added in both flush functions.
> Ok, I thought you were arguing we shouldn't revert it :)
>
> I don't know whether to revert or just add the pairs in the flush
> functions, because I can't say I understand what the third part of the
> patch does.
>
>> Anyway the annotation should be placed in the path where
>> wait_for_completion() might be called.
> Yes, it should be called regardless of whether we actually wait or not,
> i.e. before most of the checking in the functions.
>
> My issue #3 that I outlined is related - we'd like to have lockdep
> understand that if this work was on the WQ it might deadlock, but we
> don't have a way to get the WQ unless the work is scheduled - and in the
> case that it is scheduled we might already be hitting the deadlock
> (depending on the order of execution though I guess).
>
>> Absolutely true. You can find my opinion about that in
>> Documentation/locking/crossrelease.txt which has been removed because
>> crossrelease is strong at detecting *potential* deadlock problems.
> Ok, you know what, I'm going to read this now ... hang on........

But I'm very sorry in advance for the poor English in it...

The document should have been contributed by people who
are good at English. It might be too naive for you to read.

>
> I see. You were trying to solve the problem of completions/context
> transfers in lockdep.
>
> Given the revert of crossrelease though, we can't actually revert your
> patch anyway, and looking at it again I see what you mean by the "name"
> now ...
>
> So yeah, we should only re-add the two acquire/release pairs. I guess
> I'll make a patch for that, replacing my patch 2.
>
> I'm intrigued by the crossrelease - but that's a separate topic.
>
> johannes
>


