* [PATCH v3 0/3] Cleanup waitqueue_active and barriers
@ 2018-05-14 12:23 David Sterba
  2018-05-14 12:23 ` [PATCH v3 1/3] btrfs: introduce conditional wakeup helpers David Sterba
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: David Sterba @ 2018-05-14 12:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba

Reduce the number of standalone barriers before waitqueue_active calls.

Changes v3:
* fix wrong use of the _nomb variant in tree-log.c:btrfs_sync_log
* update comments to be more specific about waitqueue_active and the
  barrier

Changes v2:
* add 2 barriers to btrfs_sync_log and do not assume they're implied
  (pointed out by Nikolay)

git://github.com/kdave/btrfs-devel.git cleanup/cond-wake

David Sterba (3):
  btrfs: introduce conditional wakeup helpers
  btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
  btrfs: replace waitqueue_active with cond_wake_up

 fs/btrfs/compression.c   |  7 +------
 fs/btrfs/ctree.h         | 22 ++++++++++++++++++++++
 fs/btrfs/delayed-inode.c |  9 +++------
 fs/btrfs/dev-replace.c   | 10 ++++------
 fs/btrfs/extent-tree.c   |  7 +------
 fs/btrfs/inode.c         |  9 +++------
 fs/btrfs/locking.c       | 34 +++++++++++-----------------------
 fs/btrfs/ordered-data.c  | 14 ++++----------
 fs/btrfs/transaction.c   |  7 +------
 fs/btrfs/tree-log.c      | 28 ++++++++++++----------------
 10 files changed, 62 insertions(+), 85 deletions(-)

-- 
2.16.2



* [PATCH v3 1/3] btrfs: introduce conditional wakeup helpers
  2018-05-14 12:23 [PATCH v3 0/3] Cleanup waitqueue_active and barriers David Sterba
@ 2018-05-14 12:23 ` David Sterba
  2018-05-14 12:23 ` [PATCH v3 2/3] btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups David Sterba
  2018-05-14 12:23 ` [PATCH v3 3/3] btrfs: replace waitqueue_active with cond_wake_up David Sterba
  2 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2018-05-14 12:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba

Add convenience wrappers for the waitqueue management that involves
memory barriers to prevent missed wakeups. The helpers will let us
remove the open-coded barriers and the accompanying comments in several
places.

Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/ctree.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 2771cc56a622..1389942f3be1 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3765,4 +3765,26 @@ static inline int btrfs_is_testing(struct btrfs_fs_info *fs_info)
 	return 0;
 }
 
+static inline void cond_wake_up(struct wait_queue_head *wq)
+{
+	/*
+	 * wq_has_sleeper() implies a full smp_mb() barrier; see the
+	 * comments above waitqueue_active() for why it is needed.
+	 */
+	if (wq_has_sleeper(wq))
+		wake_up(wq);
+}
+
+static inline void cond_wake_up_nomb(struct wait_queue_head *wq)
+{
+	/*
+	 * Special case for conditional wakeup where the barrier required for
+	 * waitqueue_active is implied by some of the preceding code, e.g. by
+	 * an atomic operation (atomic_dec_return, ...) or by an unlock/lock
+	 * sequence.
+	 */
+	if (waitqueue_active(wq))
+		wake_up(wq);
+}
+
 #endif
-- 
2.16.2



* [PATCH v3 2/3] btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
  2018-05-14 12:23 [PATCH v3 0/3] Cleanup waitqueue_active and barriers David Sterba
  2018-05-14 12:23 ` [PATCH v3 1/3] btrfs: introduce conditional wakeup helpers David Sterba
@ 2018-05-14 12:23 ` David Sterba
  2018-05-14 14:41   ` Nikolay Borisov
  2018-05-14 12:23 ` [PATCH v3 3/3] btrfs: replace waitqueue_active with cond_wake_up David Sterba
  2 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2018-05-14 12:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba, Nikolay Borisov

Currently the code assumes that there's a barrier implied by the
sequence of code preceding the wakeup, namely the mutex unlock.

As Nikolay pointed out:

I think this is wrong (not your code, but the original assumption that
the RELEASE semantics provided by mutex_unlock are sufficient).
According to memory-barriers.txt:

Section 'LOCK ACQUISITION FUNCTIONS' states:

 (2) RELEASE operation implication:

     Memory operations issued before the RELEASE will be completed before the
     RELEASE operation has completed.

     Memory operations issued after the RELEASE *may* be completed before the
     RELEASE operation has completed.

(I've emphasized the *may* portion)

The example given there:

As an example, consider the following:

    *A = a;
    *B = b;
    ACQUIRE
    *C = c;
    *D = d;
    RELEASE
    *E = e;
    *F = f;

The following sequence of events is acceptable:

    ACQUIRE, {*F,*A}, *E, {*C,*D}, *B, RELEASE

So if we assume that *C is modifying the flag which the waitqueue is checking,
and *E is the actual wakeup, then those accesses can be re-ordered...

IMHO this code should be considered broken...
---

To be on the safe side, add the barriers. The synchronization logic
around the log, which involves the mutexes and several other threads,
does not make it easy to argue for or against the barrier.

CC: Nikolay Borisov <nborisov@suse.com>
Link: https://lkml.kernel.org/r/6ee068d8-1a69-3728-00d1-d86293d43c9f@suse.com
Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/tree-log.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 43758e30aa7a..fa5b3dc5f4d5 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3116,8 +3116,11 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 	mutex_unlock(&log_root_tree->log_mutex);
 
 	/*
-	 * The barrier before waitqueue_active is implied by mutex_unlock
+	 * The barrier before waitqueue_active is needed so all the updates
+	 * above are seen by the woken threads. It might not be necessary, but
+	 * proving that seems to be hard.
 	 */
+	smp_mb();
 	if (waitqueue_active(&log_root_tree->log_commit_wait[index2]))
 		wake_up(&log_root_tree->log_commit_wait[index2]);
 out:
@@ -3128,8 +3131,11 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 	mutex_unlock(&root->log_mutex);
 
 	/*
-	 * The barrier before waitqueue_active is implied by mutex_unlock
+	 * The barrier before waitqueue_active is needed so all the updates
+	 * above are seen by the woken threads. It might not be necessary, but
+	 * proving that seems to be hard.
 	 */
+	smp_mb();
 	if (waitqueue_active(&root->log_commit_wait[index1]))
 		wake_up(&root->log_commit_wait[index1]);
 	return ret;
-- 
2.16.2



* [PATCH v3 3/3] btrfs: replace waitqueue_active with cond_wake_up
  2018-05-14 12:23 [PATCH v3 0/3] Cleanup waitqueue_active and barriers David Sterba
  2018-05-14 12:23 ` [PATCH v3 1/3] btrfs: introduce conditional wakeup helpers David Sterba
  2018-05-14 12:23 ` [PATCH v3 2/3] btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups David Sterba
@ 2018-05-14 12:23 ` David Sterba
  2018-05-14 15:42   ` Nikolay Borisov
  2 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2018-05-14 12:23 UTC (permalink / raw)
  To: linux-btrfs; +Cc: David Sterba

Use the wrappers and reduce the number of exposed low-level details of
the waitqueue management.

Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/compression.c   |  7 +------
 fs/btrfs/delayed-inode.c |  9 +++------
 fs/btrfs/dev-replace.c   | 10 ++++------
 fs/btrfs/extent-tree.c   |  7 +------
 fs/btrfs/inode.c         |  9 +++------
 fs/btrfs/locking.c       | 34 +++++++++++-----------------------
 fs/btrfs/ordered-data.c  | 14 ++++----------
 fs/btrfs/transaction.c   |  7 +------
 fs/btrfs/tree-log.c      | 34 ++++++++++++----------------------
 9 files changed, 40 insertions(+), 91 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 5268c9f85ca7..dcebb91e4e0c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -990,12 +990,7 @@ static void __free_workspace(int type, struct list_head *workspace,
 		btrfs_compress_op[idx]->free_workspace(workspace);
 	atomic_dec(total_ws);
 wake:
-	/*
-	 * Make sure counter is updated before we wake up waiters.
-	 */
-	smp_mb();
-	if (waitqueue_active(ws_wait))
-		wake_up(ws_wait);
+	cond_wake_up(ws_wait);
 }
 
 static void free_workspace(int type, struct list_head *ws)
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index a8d492dbd3e7..fe6caa7e698b 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -460,13 +460,10 @@ static void finish_one_item(struct btrfs_delayed_root *delayed_root)
 {
 	int seq = atomic_inc_return(&delayed_root->items_seq);
 
-	/*
-	 * atomic_dec_return implies a barrier for waitqueue_active
-	 */
+	/* atomic_dec_return implies a barrier */
 	if ((atomic_dec_return(&delayed_root->items) <
-	    BTRFS_DELAYED_BACKGROUND || seq % BTRFS_DELAYED_BATCH == 0) &&
-	    waitqueue_active(&delayed_root->wait))
-		wake_up(&delayed_root->wait);
+	    BTRFS_DELAYED_BACKGROUND || seq % BTRFS_DELAYED_BATCH == 0))
+		cond_wake_up_nomb(&delayed_root->wait);
 }
 
 static void __btrfs_remove_delayed_item(struct btrfs_delayed_item *delayed_item)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index f82be266ba4b..d33ff5b23302 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -916,9 +916,9 @@ void btrfs_dev_replace_clear_lock_blocking(
 	ASSERT(atomic_read(&dev_replace->read_locks) > 0);
 	ASSERT(atomic_read(&dev_replace->blocking_readers) > 0);
 	read_lock(&dev_replace->lock);
-	if (atomic_dec_and_test(&dev_replace->blocking_readers) &&
-	    waitqueue_active(&dev_replace->read_lock_wq))
-		wake_up(&dev_replace->read_lock_wq);
+	/* Barrier implied by atomic_dec_and_test */
+	if (atomic_dec_and_test(&dev_replace->blocking_readers))
+		cond_wake_up_nomb(&dev_replace->read_lock_wq);
 }
 
 void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
@@ -929,9 +929,7 @@ void btrfs_bio_counter_inc_noblocked(struct btrfs_fs_info *fs_info)
 void btrfs_bio_counter_sub(struct btrfs_fs_info *fs_info, s64 amount)
 {
 	percpu_counter_sub(&fs_info->bio_counter, amount);
-
-	if (waitqueue_active(&fs_info->replace_wait))
-		wake_up(&fs_info->replace_wait);
+	cond_wake_up_nomb(&fs_info->replace_wait);
 }
 
 void btrfs_bio_counter_inc_blocked(struct btrfs_fs_info *fs_info)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 3871658b6ab1..94fc825a126d 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -11094,12 +11094,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range)
 void btrfs_end_write_no_snapshotting(struct btrfs_root *root)
 {
 	percpu_counter_dec(&root->subv_writers->counter);
-	/*
-	 * Make sure counter is updated before we wake up waiters.
-	 */
-	smp_mb();
-	if (waitqueue_active(&root->subv_writers->wait))
-		wake_up(&root->subv_writers->wait);
+	cond_wake_up(&root->subv_writers->wait);
 }
 
 int btrfs_start_write_no_snapshotting(struct btrfs_root *root)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d241285a0d2a..cf6f2815ffee 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1156,13 +1156,10 @@ static noinline void async_cow_submit(struct btrfs_work *work)
 	nr_pages = (async_cow->end - async_cow->start + PAGE_SIZE) >>
 		PAGE_SHIFT;
 
-	/*
-	 * atomic_sub_return implies a barrier for waitqueue_active
-	 */
+	/* atomic_sub_return implies a barrier */
 	if (atomic_sub_return(nr_pages, &fs_info->async_delalloc_pages) <
-	    5 * SZ_1M &&
-	    waitqueue_active(&fs_info->async_submit_wait))
-		wake_up(&fs_info->async_submit_wait);
+	    5 * SZ_1M)
+		cond_wake_up_nomb(&fs_info->async_submit_wait);
 
 	if (async_cow->inode)
 		submit_compressed_extents(async_cow->inode, async_cow);
diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index e4faefac9d16..1da768e5ef75 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -66,22 +66,16 @@ void btrfs_clear_lock_blocking_rw(struct extent_buffer *eb, int rw)
 		write_lock(&eb->lock);
 		WARN_ON(atomic_read(&eb->spinning_writers));
 		atomic_inc(&eb->spinning_writers);
-		/*
-		 * atomic_dec_and_test implies a barrier for waitqueue_active
-		 */
-		if (atomic_dec_and_test(&eb->blocking_writers) &&
-		    waitqueue_active(&eb->write_lock_wq))
-			wake_up(&eb->write_lock_wq);
+		/* atomic_dec_and_test implies a barrier */
+		if (atomic_dec_and_test(&eb->blocking_writers))
+			cond_wake_up_nomb(&eb->write_lock_wq);
 	} else if (rw == BTRFS_READ_LOCK_BLOCKING) {
 		BUG_ON(atomic_read(&eb->blocking_readers) == 0);
 		read_lock(&eb->lock);
 		atomic_inc(&eb->spinning_readers);
-		/*
-		 * atomic_dec_and_test implies a barrier for waitqueue_active
-		 */
-		if (atomic_dec_and_test(&eb->blocking_readers) &&
-		    waitqueue_active(&eb->read_lock_wq))
-			wake_up(&eb->read_lock_wq);
+		/* atomic_dec_and_test implies a barrier */
+		if (atomic_dec_and_test(&eb->blocking_readers))
+			cond_wake_up_nomb(&eb->read_lock_wq);
 	}
 }
 
@@ -221,12 +215,9 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
 	}
 	btrfs_assert_tree_read_locked(eb);
 	WARN_ON(atomic_read(&eb->blocking_readers) == 0);
-	/*
-	 * atomic_dec_and_test implies a barrier for waitqueue_active
-	 */
-	if (atomic_dec_and_test(&eb->blocking_readers) &&
-	    waitqueue_active(&eb->read_lock_wq))
-		wake_up(&eb->read_lock_wq);
+	/* atomic_dec_and_test implies a barrier */
+	if (atomic_dec_and_test(&eb->blocking_readers))
+		cond_wake_up_nomb(&eb->read_lock_wq);
 	atomic_dec(&eb->read_locks);
 }
 
@@ -275,12 +266,9 @@ void btrfs_tree_unlock(struct extent_buffer *eb)
 	if (blockers) {
 		WARN_ON(atomic_read(&eb->spinning_writers));
 		atomic_dec(&eb->blocking_writers);
-		/*
-		 * Make sure counter is updated before we wake up waiters.
-		 */
+		/* Use the lighter barrier after atomic */
 		smp_mb__after_atomic();
-		if (waitqueue_active(&eb->write_lock_wq))
-			wake_up(&eb->write_lock_wq);
+		cond_wake_up_nomb(&eb->write_lock_wq);
 	} else {
 		WARN_ON(atomic_read(&eb->spinning_writers) != 1);
 		atomic_dec(&eb->spinning_writers);
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index 6db8bb2f2c28..2e1a1694a33d 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -343,11 +343,8 @@ int btrfs_dec_test_first_ordered_pending(struct inode *inode,
 
 	if (entry->bytes_left == 0) {
 		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
-		/*
-		 * Implicit memory barrier after test_and_set_bit
-		 */
-		if (waitqueue_active(&entry->wait))
-			wake_up(&entry->wait);
+		/* test_and_set_bit implies a barrier */
+		cond_wake_up_nomb(&entry->wait);
 	} else {
 		ret = 1;
 	}
@@ -410,11 +407,8 @@ int btrfs_dec_test_ordered_pending(struct inode *inode,
 
 	if (entry->bytes_left == 0) {
 		ret = test_and_set_bit(BTRFS_ORDERED_IO_DONE, &entry->flags);
-		/*
-		 * Implicit memory barrier after test_and_set_bit
-		 */
-		if (waitqueue_active(&entry->wait))
-			wake_up(&entry->wait);
+		/* test_and_set_bit implies a barrier */
+		cond_wake_up_nomb(&entry->wait);
 	} else {
 		ret = 1;
 	}
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index c944b4769e3c..ff841abb756e 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -877,12 +877,7 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 	atomic_dec(&cur_trans->num_writers);
 	extwriter_counter_dec(cur_trans, trans->type);
 
-	/*
-	 * Make sure counter is updated before we wake up waiters.
-	 */
-	smp_mb();
-	if (waitqueue_active(&cur_trans->writer_wait))
-		wake_up(&cur_trans->writer_wait);
+	cond_wake_up(&cur_trans->writer_wait);
 	btrfs_put_transaction(cur_trans);
 
 	if (current->journal_info == trans)
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index fa5b3dc5f4d5..a3ebda24ae12 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -222,11 +222,8 @@ int btrfs_pin_log_trans(struct btrfs_root *root)
 void btrfs_end_log_trans(struct btrfs_root *root)
 {
 	if (atomic_dec_and_test(&root->log_writers)) {
-		/*
-		 * Implicit memory barrier after atomic_dec_and_test
-		 */
-		if (waitqueue_active(&root->log_writer_wait))
-			wake_up(&root->log_writer_wait);
+		/* atomic_dec_and_test implies a barrier */
+		cond_wake_up_nomb(&root->log_writer_wait);
 	}
 }
 
@@ -2988,11 +2985,8 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 
 	mutex_lock(&log_root_tree->log_mutex);
 	if (atomic_dec_and_test(&log_root_tree->log_writers)) {
-		/*
-		 * Implicit memory barrier after atomic_dec_and_test
-		 */
-		if (waitqueue_active(&log_root_tree->log_writer_wait))
-			wake_up(&log_root_tree->log_writer_wait);
+		/* atomic_dec_and_test implies a barrier */
+		cond_wake_up_nomb(&log_root_tree->log_writer_wait);
 	}
 
 	if (ret) {
@@ -3116,13 +3110,11 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 	mutex_unlock(&log_root_tree->log_mutex);
 
 	/*
-	 * The barrier before waitqueue_active is needed so all the updates
-	 * above are seen by the woken threads. It might not be necessary, but
-	 * proving that seems to be hard.
+	 * The barrier before waitqueue_active (in cond_wake_up) is needed so
+	 * all the updates above are seen by the woken threads. It might not be
+	 * necessary, but proving that seems to be hard.
 	 */
-	smp_mb();
-	if (waitqueue_active(&log_root_tree->log_commit_wait[index2]))
-		wake_up(&log_root_tree->log_commit_wait[index2]);
+	cond_wake_up(&log_root_tree->log_commit_wait[index2]);
 out:
 	mutex_lock(&root->log_mutex);
 	btrfs_remove_all_log_ctxs(root, index1, ret);
@@ -3131,13 +3123,11 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
 	mutex_unlock(&root->log_mutex);
 
 	/*
-	 * The barrier before waitqueue_active is needed so all the updates
-	 * above are seen by the woken threads. It might not be necessary, but
-	 * proving that seems to be hard.
+	 * The barrier before waitqueue_active (in cond_wake_up) is needed so
+	 * all the updates above are seen by the woken threads. It might not be
+	 * necessary, but proving that seems to be hard.
 	 */
-	smp_mb();
-	if (waitqueue_active(&root->log_commit_wait[index1]))
-		wake_up(&root->log_commit_wait[index1]);
+	cond_wake_up(&root->log_commit_wait[index1]);
 	return ret;
 }
 
-- 
2.16.2



* Re: [PATCH v3 2/3] btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups
  2018-05-14 12:23 ` [PATCH v3 2/3] btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups David Sterba
@ 2018-05-14 14:41   ` Nikolay Borisov
  0 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2018-05-14 14:41 UTC (permalink / raw)
  To: David Sterba, linux-btrfs



On 14.05.2018 15:23, David Sterba wrote:
> Currently the code assumes that there's a barrier implied by the
> sequence of code preceding the wakeup, namely the mutex unlock.
> 
> As Nikolay pointed out:
> 
> I think this is wrong (not your code, but the original assumption that
> the RELEASE semantics provided by mutex_unlock are sufficient).
> According to memory-barriers.txt:
> 
> Section 'LOCK ACQUISITION FUNCTIONS' states:
> 
>  (2) RELEASE operation implication:
> 
>      Memory operations issued before the RELEASE will be completed before the
>      RELEASE operation has completed.
> 
>      Memory operations issued after the RELEASE *may* be completed before the
>      RELEASE operation has completed.
> 
> (I've emphasized the *may* portion)
> 
> The example given there:
> 
> As an example, consider the following:
> 
>     *A = a;
>     *B = b;
>     ACQUIRE
>     *C = c;
>     *D = d;
>     RELEASE
>     *E = e;
>     *F = f;
> 
> The following sequence of events is acceptable:
> 
>     ACQUIRE, {*F,*A}, *E, {*C,*D}, *B, RELEASE
> 
> So if we assume that *C is modifying the flag which the waitqueue is checking,
> and *E is the actual wakeup, then those accesses can be re-ordered...
> 
> IMHO this code should be considered broken...
> ---
> 
> To be on the safe side, add the barriers. The synchronization logic
> around the log, which involves the mutexes and several other threads,
> does not make it easy to argue for or against the barrier.
> 
> CC: Nikolay Borisov <nborisov@suse.com>
> Link: https://lkml.kernel.org/r/6ee068d8-1a69-3728-00d1-d86293d43c9f@suse.com
> Signed-off-by: David Sterba <dsterba@suse.com>
> ---

Apart from what I said initially, which prompted this patch, I can't say
anything else. I think the fsync code is in dire need of being
rewritten/simplified.

But in so far as the newly introduced barriers are concerned:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>


* Re: [PATCH v3 3/3] btrfs: replace waitqueue_active with cond_wake_up
  2018-05-14 12:23 ` [PATCH v3 3/3] btrfs: replace waitqueue_active with cond_wake_up David Sterba
@ 2018-05-14 15:42   ` Nikolay Borisov
  0 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2018-05-14 15:42 UTC (permalink / raw)
  To: David Sterba, linux-btrfs



On 14.05.2018 15:23, David Sterba wrote:
> Use the wrappers and reduce the amount of low-level details about the
> waitqueue management.
> 
> Signed-off-by: David Sterba <dsterba@suse.com>

Reviewed-by: Nikolay Borisov <nborisov@suse.com>



Thread overview: 6+ messages
2018-05-14 12:23 [PATCH v3 0/3] Cleanup waitqueue_active and barriers David Sterba
2018-05-14 12:23 ` [PATCH v3 1/3] btrfs: introduce conditional wakeup helpers David Sterba
2018-05-14 12:23 ` [PATCH v3 2/3] btrfs: add barriers to btrfs_sync_log before log_commit_wait wakeups David Sterba
2018-05-14 14:41   ` Nikolay Borisov
2018-05-14 12:23 ` [PATCH v3 3/3] btrfs: replace waitqueue_active with cond_wake_up David Sterba
2018-05-14 15:42   ` Nikolay Borisov
