All of lore.kernel.org
 help / color / mirror / Atom feed
* [btrfs/rt] lockdep false positive
@ 2017-01-22  8:46 Mike Galbraith
  2017-01-22 17:45 ` Mike Galbraith
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Galbraith @ 2017-01-22  8:46 UTC (permalink / raw)
  To: LKML, linux-rt-users

Greetings btrfs/lockdep wizards,

RT trees have trouble with the BTRFS lockdep positive avoidance lock
class dance (see disk-io.c).  Seems the trouble is due to RT not having
a means of telling lockdep that its rwlocks are recursive for read by
the lock owner only, combined with the BTRFS lock class dance assuming
that read_lock() is annotated rwlock_acquire_read(), which RT cannot
do, as that would be a big fat lie.

Creating a rt_read_lock_shared() for btrfs_clear_lock_blocking_rw() did
indeed make lockdep happy as a clam for test purposes.  (hm, submitting
that would be excellent way to replenish frozen shark supply:)

Ideas?

The below is tip-rt, but that's irrelevant.  Any RT tree will do, you
just might hit the recently fixed log_mutex gripe instead of the btrfs
-tree-00/btrfs-csum-00 variants you'll eventually hit with log_mutex
splat fixed.

[  433.956516] =============================================
[  433.956516] [ INFO: possible recursive locking detected ]
[  433.956518] 4.10.0-rt1-tip-rt #36 Tainted: G            E  
[  433.956518] ---------------------------------------------
[  433.956519] kworker/u8:2/555 is trying to acquire lock:
[  433.956519]  (btrfs-csum-00){+.+...}, at: [<ffffffffa037fd14>] btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956540] 
               but task is already holding lock:
[  433.956540]  (btrfs-csum-00){+.+...}, at: [<ffffffffa037fd14>] btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956551] 
               other info that might help us debug this:
[  433.956551]  Possible unsafe locking scenario:

[  433.956552]        CPU0
[  433.956552]        ----
[  433.956552]   lock(btrfs-csum-00);
[  433.956552]   lock(btrfs-csum-00);
[  433.956553] 
                *** DEADLOCK ***

[  433.956553]  May be due to missing lock nesting notation

[  433.956554] 6 locks held by kworker/u8:2/555:
[  433.956554]  #0:  ("%s-%s""btrfs", name){.+.+..}, at: [<ffffffff8109f771>] process_one_work+0x171/0x700
[  433.956565]  #1:  ((&work->normal_work)){+.+...}, at: [<ffffffff8109f771>] process_one_work+0x171/0x700
[  433.956567]  #2:  (sb_internal){.+.+..}, at: [<ffffffffa033d4d7>] start_transaction+0x2a7/0x5a0 [btrfs]
[  433.956576]  #3:  (btrfs-csum-02){+.+...}, at: [<ffffffffa037fd14>] btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956585]  #4:  (btrfs-csum-01){+.+...}, at: [<ffffffffa037fd14>] btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956593]  #5:  (btrfs-csum-00){+.+...}, at: [<ffffffffa037fd14>] btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956601] 

Lock class assignment leadin
 btrfs-transacti-623   [002] .......   406.637399: btrfs_set_buffer_lockdep_class: set &eb->lock: ffff88014a087ce0 level: 0 to btrfs-extent-00
    kworker/u8:5-558   [000] .......   429.673871: btrfs_set_buffer_lockdep_class: set &eb->lock: ffff880007073ce0 level: 2 to btrfs-csum-02
    kworker/u8:5-558   [000] .......   429.673904: btrfs_set_buffer_lockdep_class: set &eb->lock: ffff88014a087ce0 level: 1 to btrfs-csum-01
    kworker/u8:0-5     [002] .......   433.022595: btrfs_set_buffer_lockdep_class: set &eb->lock: ffff88009bd98fe0 level: 0 to btrfs-csum-00 *
    kworker/u8:2-555   [001] .......   433.838082: btrfs_set_buffer_lockdep_class: set &eb->lock: ffff880096e924e0 level: 0 to btrfs-csum-00

Our hero about to go splat
    kworker/u8:2-555   [000] .......   434.043172: btrfs_clear_lock_blocking_rw: read_lock(&eb->lock: ffff880007073ce0)  == btrfs-csum-02
    kworker/u8:2-555   [000] .....11   434.043172: btrfs_clear_lock_blocking_rw: read_lock(&eb->lock: ffff88014a087ce0)  == btrfs-csum-01
    kworker/u8:2-555   [000] .....12   434.043173: btrfs_clear_lock_blocking_rw: read_lock(&eb->lock: ffff88009bd98fe0)  == btrfs-csum-00  set by kworker/u8:0-5
    kworker/u8:2-555   [000] .....13   434.043173: btrfs_clear_lock_blocking_rw: read_lock(&eb->lock: ffff880096e924e0)  == btrfs-csum-00  set by hero - two locks, one key - splat

               stack backtrace:
[  433.956602] CPU: 0 PID: 555 Comm: kworker/u8:2 Tainted: G            E   4.10.0-rt1-tip-rt #36
[  433.956603] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20161202_174313-build11a 04/01/2014
[  433.956611] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[  433.956612] Call Trace:
[  433.956618]  dump_stack+0x85/0xc8
[  433.956622]  __lock_acquire+0x9f9/0x1550
[  433.956627]  ? ring_buffer_lock_reserve+0x115/0x3b0
[  433.956629]  ? ring_buffer_unlock_commit+0x27/0xe0
[  433.956630]  lock_acquire+0xbd/0x250
[  433.956637]  ? btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956641]  rt_read_lock+0x47/0x60
[  433.956648]  ? btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956654]  btrfs_clear_lock_blocking_rw+0x74/0x130 [btrfs]
[  433.956660]  btrfs_clear_path_blocking+0x99/0xc0 [btrfs]
[  433.956667]  btrfs_next_old_leaf+0x407/0x440 [btrfs]
[  433.956674]  btrfs_next_leaf+0x10/0x20 [btrfs]
[  433.956681]  btrfs_csum_file_blocks+0x31a/0x5f0 [btrfs]
[  433.956682]  ? migrate_enable+0x87/0x160
[  433.956690]  add_pending_csums.isra.46+0x4d/0x70 [btrfs]
[  433.956698]  btrfs_finish_ordered_io+0x30f/0x710 [btrfs]
[  433.956705]  finish_ordered_fn+0x15/0x20 [btrfs]
[  433.956714]  normal_work_helper+0x104/0x620 [btrfs]
[  433.956722]  btrfs_endio_write_helper+0x12/0x20 [btrfs]
[  433.956723]  process_one_work+0x1f0/0x700
[  433.956723]  ? process_one_work+0x171/0x700
[  433.956725]  worker_thread+0x171/0x530
[  433.956726]  kthread+0x10c/0x140
[  433.956727]  ? create_worker+0x1b0/0x1b0
[  433.956728]  ? kthread_park+0x90/0x90
[  433.956729]  ret_from_fork+0x31/0x40

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-22  8:46 [btrfs/rt] lockdep false positive Mike Galbraith
@ 2017-01-22 17:45 ` Mike Galbraith
  2017-01-22 18:25   ` Mike Galbraith
                     ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Mike Galbraith @ 2017-01-22 17:45 UTC (permalink / raw)
  To: LKML, linux-rt-users
  Cc: Peter Zijlstra, Thomas Gleixner, Sebastian Andrzej Siewior

On Sun, 2017-01-22 at 09:46 +0100, Mike Galbraith wrote:
> Greetings btrfs/lockdep wizards,
> 
> RT trees have trouble with the BTRFS lockdep positive avoidance lock
> class dance (see disk-io.c).  Seems the trouble is due to RT not having
> a means of telling lockdep that its rwlocks are recursive for read by
> the lock owner only, combined with the BTRFS lock class dance assuming
> that read_lock() is annotated rwlock_acquire_read(), which RT cannot
> do, as that would be a big fat lie.
> 
> Creating a rt_read_lock_shared() for btrfs_clear_lock_blocking_rw() did
> indeed make lockdep happy as a clam for test purposes.  (hm, submitting
> that would be excellent way to replenish frozen shark supply:)
> 
> Ideas?

Hrm.  The below seems to work fine, but /me strongly suspects that if
it were this damn trivial, the issue would be long dead.

RT does not have a way to describe its rwlock semantics to lockdep,
leading to the btrfs false positive below.  Btrfs maintains an array
of keys which it assigns on the fly in order to avoid false positives
in stock code, however, that scheme depends upon lockdep knowing that
read_lock()+read_lock() is allowed within a class, as multiple locks
are assigned to the same class, and end up acquired by the same task.

[  341.960754] =============================================
[  341.960754] [ INFO: possible recursive locking detected ]
[  341.960756] 4.10.0-rt1-rt #124 Tainted: G            E  
[  341.960756] ---------------------------------------------
[  341.960757] kworker/u8:9/2039 is trying to acquire lock:
[  341.960757]  (btrfs-tree-00){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]

This kworker assigned this lock to class 'tree' level 0 shortly
before acquisition, however..

[  341.960783] 
[  341.960783]  but task is already holding lock:
[  341.960783]  (btrfs-tree-00){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]

..another kworker previously assigned another lock we now hold to the
'tree' level 0 key as well.  Since RT tells lockdep that read_lock() is an
exclusive acquisition, in class read_lock()+read_lock() is forbidden.

[  341.960794]        CPU0
[  341.960795]        ----
[  341.960795]   lock(btrfs-tree-00);
[  341.960795]   lock(btrfs-tree-00);
[  341.960796] 
[  341.960796]  *** DEADLOCK ***
[  341.960796]
[  341.960796]  May be due to missing lock nesting notation
[  341.960796]
[  341.960796] 6 locks held by kworker/u8:9/2039:
[  341.960797]  #0:  ("%s-%s""btrfs", name){.+.+..}, at: [<ffffffff8109f711>] process_one_work+0x171/0x700
[  341.960812]  #1:  ((&work->normal_work)){+.+...}, at: [<ffffffff8109f711>] process_one_work+0x171/0x700
[  341.960815]  #2:  (sb_internal){.+.+..}, at: [<ffffffffa032d4f7>] start_transaction+0x2a7/0x5a0 [btrfs]
[  341.960825]  #3:  (btrfs-tree-02){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
[  341.960835]  #4:  (btrfs-tree-01){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
[  341.960854]  #5:  (btrfs-tree-00){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]

Attempting to describe RT rwlock semantics to lockdep prevents this.

Not-signed-off-by: /me
---
 include/linux/lockdep.h  |    5 +++++
 kernel/locking/lockdep.c |    8 ++++++++
 kernel/locking/rt.c      |    7 ++-----
 3 files changed, 15 insertions(+), 5 deletions(-)

--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -543,13 +543,18 @@ static inline void print_irqtrace_events
 #define lock_acquire_exclusive(l, s, t, n, i)		lock_acquire(l, s, t, 0, 1, n, i)
 #define lock_acquire_shared(l, s, t, n, i)		lock_acquire(l, s, t, 1, 1, n, i)
 #define lock_acquire_shared_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 2, 1, n, i)
+#define lock_acquire_reader_recursive(l, s, t, n, i)	lock_acquire(l, s, t, 3, 1, n, i)
 
 #define spin_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
 #define spin_acquire_nest(l, s, t, n, i)	lock_acquire_exclusive(l, s, t, n, i)
 #define spin_release(l, n, i)			lock_release(l, n, i)
 
 #define rwlock_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
+#ifndef CONFIG_PREEMPT_RT_FULL
 #define rwlock_acquire_read(l, s, t, i)		lock_acquire_shared_recursive(l, s, t, NULL, i)
+#else
+#define rwlock_acquire_read(l, s, t, i)		lock_acquire_reader_recursive(l, s, t, NULL, i)
+#endif
 #define rwlock_release(l, n, i)			lock_release(l, n, i)
 
 #define seqcount_acquire(l, s, t, i)		lock_acquire_exclusive(l, s, t, NULL, i)
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1761,6 +1761,14 @@ check_deadlock(struct task_struct *curr,
 		if ((read == 2) && prev->read)
 			return 2;
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+		/*
+		 * Allow read-after-read or read-after-write recursion of the
+		 * same lock class for RT rwlocks.
+		 */
+		if (read == 3 && (prev->read == 3 || prev->read == 0))
+			return 2;
+#endif
 		/*
 		 * We're holding the nest_lock, which serializes this lock's
 		 * nesting behaviour.
--- a/kernel/locking/rt.c
+++ b/kernel/locking/rt.c
@@ -265,17 +265,14 @@ void __lockfunc rt_read_lock(rwlock_t *r
 {
 	struct rt_mutex *lock = &rwlock->lock;
 
-
+	rwlock_acquire_read(&rwlock->dep_map, 0, 0, _RET_IP_);
 	/*
 	 * recursive read locks succeed when current owns the lock
 	 */
-	if (rt_mutex_owner(lock) != current) {
-		rwlock_acquire(&rwlock->dep_map, 0, 0, _RET_IP_);
+	if (rt_mutex_owner(lock) != current)
 		__rt_spin_lock(lock);
-	}
 	rwlock->read_depth++;
 }
-
 EXPORT_SYMBOL(rt_read_lock);
 
 void __lockfunc rt_write_unlock(rwlock_t *rwlock)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-22 17:45 ` Mike Galbraith
@ 2017-01-22 18:25   ` Mike Galbraith
  2017-01-23  5:23   ` Mike Galbraith
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2017-01-22 18:25 UTC (permalink / raw)
  To: LKML, linux-rt-users
  Cc: Peter Zijlstra, Thomas Gleixner, Sebastian Andrzej Siewior

 
> +> 	> 	> /*
> +> 	> 	>  * Allow read-after-read or read-after-write recursion of the
> +> 	> 	>  * same lock class for RT rwlocks.
> +> 	> 	>  */
> +> 	> 	> if (read == 3 && (prev->read == 3 || prev->read == 0))

Pff, shoulda left it reader vs reader.. but it's gotta be wrong anyway.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-22 17:45 ` Mike Galbraith
  2017-01-22 18:25   ` Mike Galbraith
@ 2017-01-23  5:23   ` Mike Galbraith
  2017-01-23  9:07   ` Peter Zijlstra
  2017-01-25 17:02   ` Sebastian Andrzej Siewior
  3 siblings, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2017-01-23  5:23 UTC (permalink / raw)
  To: LKML, linux-rt-users
  Cc: Peter Zijlstra, Thomas Gleixner, Sebastian Andrzej Siewior

On Sun, 2017-01-22 at 18:45 +0100, Mike Galbraith wrote:
> On Sun, 2017-01-22 at 09:46 +0100, Mike Galbraith wrote:
> > Greetings btrfs/lockdep wizards,
> > 
> > RT trees have trouble with the BTRFS lockdep positive avoidance lock
> > class dance (see disk-io.c).  Seems the trouble is due to RT not having
> > a means of telling lockdep that its rwlocks are recursive for read by
> > the lock owner only, combined with the BTRFS lock class dance assuming
> > that read_lock() is annotated rwlock_acquire_read(), which RT cannot
> > do, as that would be a big fat lie.
> > 
> > Creating a rt_read_lock_shared() for btrfs_clear_lock_blocking_rw() did
> > indeed make lockdep happy as a clam for test purposes.  (hm, submitting
> > that would be excellent way to replenish frozen shark supply:)
> > 
> > Ideas?
> 
> Hrm.  The below seems to work fine, but /me strongly suspects that if
> it were this damn trivial, the issue would be long dead.

(iow, did I merely spell '2' as '3' vs creating the annotation I want)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-22 17:45 ` Mike Galbraith
  2017-01-22 18:25   ` Mike Galbraith
  2017-01-23  5:23   ` Mike Galbraith
@ 2017-01-23  9:07   ` Peter Zijlstra
  2017-01-25 17:02   ` Sebastian Andrzej Siewior
  3 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2017-01-23  9:07 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: LKML, linux-rt-users, Thomas Gleixner, Sebastian Andrzej Siewior

On Sun, Jan 22, 2017 at 06:45:14PM +0100, Mike Galbraith wrote:
> On Sun, 2017-01-22 at 09:46 +0100, Mike Galbraith wrote:
> > Greetings btrfs/lockdep wizards,
> > 
> > RT trees have trouble with the BTRFS lockdep positive avoidance lock
> > class dance (see disk-io.c).  Seems the trouble is due to RT not having
> > a means of telling lockdep that its rwlocks are recursive for read by
> > the lock owner only, combined with the BTRFS lock class dance assuming
> > that read_lock() is annotated rwlock_acquire_read(), which RT cannot
> > do, as that would be a big fat lie.
> > 
> > Creating a rt_read_lock_shared() for btrfs_clear_lock_blocking_rw() did
> > indeed make lockdep happy as a clam for test purposes.  (hm, submitting
> > that would be excellent way to replenish frozen shark supply:)
> > 
> > Ideas?

Not having looked at anything much, currently lockdep does not in fact
model rwlock properly as is.

Note that rwlock is _not_ in fact reader biased like it used to be, that
is, read_lock() will block if there is a pending writer (just like
rwsem), with the exception when read_lock() happend in_interrupt()
(because tasklist_lock).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-22 17:45 ` Mike Galbraith
                     ` (2 preceding siblings ...)
  2017-01-23  9:07   ` Peter Zijlstra
@ 2017-01-25 17:02   ` Sebastian Andrzej Siewior
  2017-01-25 18:29     ` Mike Galbraith
  3 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-01-25 17:02 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, linux-rt-users, Peter Zijlstra, Thomas Gleixner

On 2017-01-22 18:45:14 [+0100], Mike Galbraith wrote:
> RT does not have a way to describe its rwlock semantics to lockdep,
> leading to the btrfs false positive below.  Btrfs maintains an array
> of keys which it assigns on the fly in order to avoid false positives
> in stock code, however, that scheme depends upon lockdep knowing that
> read_lock()+read_lock() is allowed within a class, as multiple locks
> are assigned to the same class, and end up acquired by the same task.

read_lock(A)+read_lock(A) of the same lock is okay because the lock is
already held and a writer is blocked. Lockdep won't see the second lock.
That means the second read_lock() is also successful if we have a writer
waiting after the first read_lock().

> [  341.960754] =============================================
> [  341.960754] [ INFO: possible recursive locking detected ]
> [  341.960756] 4.10.0-rt1-rt #124 Tainted: G            E  
> [  341.960756] ---------------------------------------------
> [  341.960757] kworker/u8:9/2039 is trying to acquire lock:
> [  341.960757]  (btrfs-tree-00){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> 
> This kworker assigned this lock to class 'tree' level 0 shortly
> before acquisition, however..
> 
> [  341.960783] 
> [  341.960783]  but task is already holding lock:
> [  341.960783]  (btrfs-tree-00){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> 
> ..another kworker previously assigned another lock we now hold to the
> 'tree' level 0 key as well.  Since RT tells lockdep that read_lock() is an
> exclusive acquisition, in class read_lock()+read_lock() is forbidden.

Hmm. So if you have kworker1 doing
	read_lock(A), read_lock(B)
and kworker2 doing
	read_lock(B), read_lock(A)

then this is something that will work fine on mainline (assuming that
neither B nor A is write-locked by something that depends one something
that is done / held by kworker1 or kworker2). However on -RT it might
deadlock because a read lock can only be taken recursively on -RT. That
means you can't have two kworkers holding the same lock at the time. One
of them will be blocked until reader lock is release. For the
non-recursively case a reader-lock on -RT behaves like a "normal" lock.
So yes, it is an exclusive acquisition.

> [  341.960794]        CPU0
> [  341.960795]        ----
> [  341.960795]   lock(btrfs-tree-00);
> [  341.960795]   lock(btrfs-tree-00);
> [  341.960796] 
> [  341.960796]  *** DEADLOCK ***
> [  341.960796]
> [  341.960796]  May be due to missing lock nesting notation
> [  341.960796]
> [  341.960796] 6 locks held by kworker/u8:9/2039:
> [  341.960797]  #0:  ("%s-%s""btrfs", name){.+.+..}, at: [<ffffffff8109f711>] process_one_work+0x171/0x700
> [  341.960812]  #1:  ((&work->normal_work)){+.+...}, at: [<ffffffff8109f711>] process_one_work+0x171/0x700
> [  341.960815]  #2:  (sb_internal){.+.+..}, at: [<ffffffffa032d4f7>] start_transaction+0x2a7/0x5a0 [btrfs]
> [  341.960825]  #3:  (btrfs-tree-02){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> [  341.960835]  #4:  (btrfs-tree-01){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> [  341.960854]  #5:  (btrfs-tree-00){+.+...}, at: [<ffffffffa036fd15>] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> 
> Attempting to describe RT rwlock semantics to lockdep prevents this.

and this is what I don't get. I stumbled upon this myself [0] but didn't
fully understand the problem (assuming this is the same problem colored
differently).
With your explanation I am not sure if I get what is happening. If btrfs
is taking here read-locks on random locks then it may deadlock if
another btrfs-thread is doing the same and need each other's locks.
If btrfs takes locks recursively which it already holds (in the same
context / process) then it shouldn't be visible here because lockdep
does not account this on -RT.
If btrfs takes the locks in a special order for instance only ascending
according to inode's number then it shouldn't deadlock.

[0] https://www.spinics.net/lists/linux-btrfs/msg61423.html

> 
> Not-signed-off-by: /me

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-25 17:02   ` Sebastian Andrzej Siewior
@ 2017-01-25 18:29     ` Mike Galbraith
  2017-01-26 17:09       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Galbraith @ 2017-01-25 18:29 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: LKML, linux-rt-users, Peter Zijlstra, Thomas Gleixner

On Wed, 2017-01-25 at 18:02 +0100, Sebastian Andrzej Siewior wrote:

> > [  341.960794]        CPU0
> > [  341.960795]        ----
> > [  341.960795]   lock(btrfs-tree-00);
> > [  341.960795]   lock(btrfs-tree-00);
> > [  341.960796] 
> > [  341.960796]  *** DEADLOCK ***
> > [  341.960796]
> > [  341.960796]  May be due to missing lock nesting notation
> > [  341.960796]
> > [  341.960796] 6 locks held by kworker/u8:9/2039:
> > [  341.960797]  #0:  ("%s-%s""btrfs", name){.+.+..}, at: [] process_one_work+0x171/0x700
> > [  341.960812]  #1:  ((&work->normal_work)){+.+...}, at: [] process_one_work+0x171/0x700
> > [  341.960815]  #2:  (sb_internal){.+.+..}, at: [] start_transaction+0x2a7/0x5a0 [btrfs]
> > [  341.960825]  #3:  (btrfs-tree-02){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > [  341.960835]  #4:  (btrfs-tree-01){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > [  341.960854]  #5:  (btrfs-tree-00){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > 
> > Attempting to describe RT rwlock semantics to lockdep prevents this.
> 
> and this is what I don't get. I stumbled upon this myself [0] but didn't
> fully understand the problem (assuming this is the same problem colored
> differently).

Yeah, [0] looks like it, though I haven't met an 'fs' variant, my
encounters were always either 'tree' or 'csum' flavors.

> With your explanation I am not sure if I get what is happening. If btrfs
> is taking here read-locks on random locks then it may deadlock if
> another btrfs-thread is doing the same and need each other's locks.

I don't know if a real RT deadlock is possible.  I haven't met one,
only variants of this bogus recursion gripe.
 
> If btrfs takes locks recursively which it already holds (in the same
> context / process) then it shouldn't be visible here because lockdep
> does not account this on -RT.

If what lockdep gripes about were true, we would never see the splat,
we'd zip straight through that (illusion) recursive read_lock() with
lockdep being none the wiser. 

> If btrfs takes the locks in a special order for instance only ascending
> according to inode's number then it shouldn't deadlock.

No idea.  Locking fancy enough to require dynamic key assignment to
appease lockdep is too fancy for me.

	-Mike

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-25 18:29     ` Mike Galbraith
@ 2017-01-26 17:09       ` Sebastian Andrzej Siewior
  2017-01-26 18:01         ` Mike Galbraith
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-01-26 17:09 UTC (permalink / raw)
  To: Mike Galbraith, linux-btrfs, David Sterba, Josef Bacik, Chris Mason
  Cc: LKML, linux-rt-users, Peter Zijlstra, Thomas Gleixner

On 2017-01-25 19:29:49 [+0100], Mike Galbraith wrote:
> On Wed, 2017-01-25 at 18:02 +0100, Sebastian Andrzej Siewior wrote:
> 
> > > [  341.960794]        CPU0
> > > [  341.960795]        ----
> > > [  341.960795]   lock(btrfs-tree-00);
> > > [  341.960795]   lock(btrfs-tree-00);
> > > [  341.960796] 
> > > [  341.960796]  *** DEADLOCK ***
> > > [  341.960796]
> > > [  341.960796]  May be due to missing lock nesting notation
> > > [  341.960796]
> > > [  341.960796] 6 locks held by kworker/u8:9/2039:
> > > [  341.960797]  #0:  ("%s-%s""btrfs", name){.+.+..}, at: [] process_one_work+0x171/0x700
> > > [  341.960812]  #1:  ((&work->normal_work)){+.+...}, at: [] process_one_work+0x171/0x700
> > > [  341.960815]  #2:  (sb_internal){.+.+..}, at: [] start_transaction+0x2a7/0x5a0 [btrfs]
> > > [  341.960825]  #3:  (btrfs-tree-02){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > > [  341.960835]  #4:  (btrfs-tree-01){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > > [  341.960854]  #5:  (btrfs-tree-00){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > > 
> > > Attempting to describe RT rwlock semantics to lockdep prevents this.
> > 
> > and this is what I don't get. I stumbled upon this myself [0] but didn't
> > fully understand the problem (assuming this is the same problem colored
> > differently).
> 
> Yeah, [0] looks like it, though I haven't met an 'fs' variant, my
> encounters were always either 'tree' or 'csum' flavors.
> 
> > With your explanation I am not sure if I get what is happening. If btrfs
> > is taking here read-locks on random locks then it may deadlock if
> > another btrfs-thread is doing the same and need each other's locks.
> 
> I don't know if a real RT deadlock is possible.  I haven't met one,
> only variants of this bogus recursion gripe.
>  
> > If btrfs takes locks recursively which it already holds (in the same
> > context / process) then it shouldn't be visible here because lockdep
> > does not account this on -RT.
> 
> If what lockdep gripes about were true, we would never see the splat,
> we'd zip straight through that (illusion) recursive read_lock() with
> lockdep being none the wiser. 
> 
> > If btrfs takes the locks in a special order for instance only ascending
> > according to inode's number then it shouldn't deadlock.
> 
> No idea.  Locking fancy enough to require dynamic key assignment to
> appease lockdep is too fancy for me.

yup, for me, too. As long as nobody from the btrfs camp explains how
that locking workings and if it is safe I am not feeling comfortable to
shut up lockdep here.

> 	-Mike

Sebastian

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [btrfs/rt] lockdep false positive
  2017-01-26 17:09       ` Sebastian Andrzej Siewior
@ 2017-01-26 18:01         ` Mike Galbraith
  0 siblings, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2017-01-26 18:01 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, linux-btrfs, David Sterba,
	Josef Bacik, Chris Mason
  Cc: LKML, linux-rt-users, Peter Zijlstra, Thomas Gleixner

On Thu, 2017-01-26 at 18:09 +0100, Sebastian Andrzej Siewior wrote:
> On 2017-01-25 19:29:49 [+0100], Mike Galbraith wrote:
> > On Wed, 2017-01-25 at 18:02 +0100, Sebastian Andrzej Siewior wrote:
> > 
> > > > [  341.960794]        CPU0
> > > > [  341.960795]        ----
> > > > [  341.960795]   lock(btrfs-tree-00);
> > > > [  341.960795]   lock(btrfs-tree-00);
> > > > [  341.960796] 
> > > > [  341.960796]  *** DEADLOCK ***
> > > > [  341.960796]
> > > > [  341.960796]  May be due to missing lock nesting notation
> > > > [  341.960796]
> > > > [  341.960796] 6 locks held by kworker/u8:9/2039:
> > > > [  341.960797]  #0:  ("%s-%s""btrfs", name){.+.+..}, at: [] process_one_work+0x171/0x700
> > > > [  341.960812]  #1:  ((&work->normal_work)){+.+...}, at: [] process_one_work+0x171/0x700
> > > > [  341.960815]  #2:  (sb_internal){.+.+..}, at: [] start_transaction+0x2a7/0x5a0 [btrfs]
> > > > [  341.960825]  #3:  (btrfs-tree-02){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > > > [  341.960835]  #4:  (btrfs-tree-01){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > > > [  341.960854]  #5:  (btrfs-tree-00){+.+...}, at: [] btrfs_clear_lock_blocking_rw+0x55/0x100 [btrfs]
> > > > 
> > > > Attempting to describe RT rwlock semantics to lockdep prevents this.
> > > 
> > > and this is what I don't get. I stumbled upon this myself [0] but didn't
> > > fully understand the problem (assuming this is the same problem colored
> > > differently).
> > 
> > Yeah, [0] looks like it, though I haven't met an 'fs' variant, my
> > encounters were always either 'tree' or 'csum' flavors.
> > 
> > > With your explanation I am not sure if I get what is happening. If btrfs
> > > is taking here read-locks on random locks then it may deadlock if
> > > another btrfs-thread is doing the same and need each other's locks.
> > 
> > I don't know if a real RT deadlock is possible.  I haven't met one,
> > only variants of this bogus recursion gripe.
> >  
> > > If btrfs takes locks recursively which it already holds (in the same
> > > context / process) then it shouldn't be visible here because lockdep
> > > does not account this on -RT.
> > 
> > If what lockdep gripes about were true, we would never see the splat,
> > we'd zip straight through that (illusion) recursive read_lock() with
> > lockdep being none the wiser. 
> > 
> > > If btrfs takes the locks in a special order for instance only ascending
> > > according to inode's number then it shouldn't deadlock.
> > 
> > No idea.  Locking fancy enough to require dynamic key assignment to
> > appease lockdep is too fancy for me.
> 
> yup, for me, too. As long as nobody from the btrfs camp explains how
> that locking workings and if it is safe I am not feeling comfortable to
> shut up lockdep here.

Works for me.  What we're talking about is an obvious false positive in
one and only one contrived situation.  It's annoying/sub-optimal, but
happily has no (known) impact other than testing, and that's trivial to
remedy.

	-Mike

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-01-26 18:55 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-22  8:46 [btrfs/rt] lockdep false positive Mike Galbraith
2017-01-22 17:45 ` Mike Galbraith
2017-01-22 18:25   ` Mike Galbraith
2017-01-23  5:23   ` Mike Galbraith
2017-01-23  9:07   ` Peter Zijlstra
2017-01-25 17:02   ` Sebastian Andrzej Siewior
2017-01-25 18:29     ` Mike Galbraith
2017-01-26 17:09       ` Sebastian Andrzej Siewior
2017-01-26 18:01         ` Mike Galbraith

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.