linux-btrfs.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: David Sterba <dsterba@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 5/5] btrfs: document extent buffer locking
Date: Fri, 18 Oct 2019 08:17:44 +0800
Message-ID: <01b47547-44fc-966f-fae2-3b7138f40adc@gmx.com>
In-Reply-To: <ed989ccddbc8822e61df533d861d907ce0a43040.1571340084.git.dsterba@suse.com>





On 2019/10/18 3:39 AM, David Sterba wrote:
> Signed-off-by: David Sterba <dsterba@suse.com>

Great document.

Some questions inlined below.
> ---
>  fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 96 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index e0e0430577aa..2a0e828b4470 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -13,6 +13,48 @@
>  #include "extent_io.h"
>  #include "locking.h"
>  
> +/*
> + * Extent buffer locking
> + * ~~~~~~~~~~~~~~~~~~~~~
> + *
> + * The locks use a custom scheme that allows more operations than are
> + * available from current locking primitives. The building blocks are still
> + * rwlock and wait queues.
> + *
> + * Required semantics:
> + *
> + * - reader/writer exclusion
> + * - writer/writer exclusion
> + * - reader/reader sharing
> + * - spinning lock semantics
> + * - blocking lock semantics
> + * - try-lock semantics for readers and writers
> + * - one level nesting, allowing read lock to be taken by the same thread that
> + *   already has write lock

Is there an example of this scenario? IIRC there is only one user of the
nested lock. Although it has existed for a long time, I guess it would be
better to try to remove such call sites?
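
To make the rule being discussed concrete, here is a minimal user-space sketch of "one level nesting" as the quoted comments describe it: the write-lock holder may take one read lock on the same buffer and must drop it before the write unlock. This is a toy single-threaded model, not the kernel implementation; `struct toy_eb`, the `toy_*` helpers and the integer task ids are all hypothetical names invented for illustration, and a `false` return stands in for the point where the real lock would wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the nesting rule only; NOT the btrfs code. */
struct toy_eb {
	int write_owner;  /* hypothetical task id holding the write lock, 0 = none */
	int readers;      /* read locks held by tasks other than the write owner */
	bool nested_read; /* write owner also holds its one nested read lock */
};

/* Take the write lock; false means the real lock would have to wait. */
bool toy_write_lock(struct toy_eb *eb, int task)
{
	if (eb->write_owner != 0 || eb->readers != 0)
		return false;
	eb->write_owner = task;
	return true;
}

/* Take a read lock; the write owner gets exactly one nested level. */
bool toy_read_lock(struct toy_eb *eb, int task)
{
	if (eb->write_owner == task) {
		if (eb->nested_read)
			return false; /* only one level of nesting */
		eb->nested_read = true;
		return true;
	}
	if (eb->write_owner != 0)
		return false; /* reader/writer exclusion */
	eb->readers++;
	return true;
}

/* Drop a read lock, nested or not. */
bool toy_read_unlock(struct toy_eb *eb, int task)
{
	if (eb->write_owner == task) {
		if (!eb->nested_read)
			return false;
		eb->nested_read = false;
		return true;
	}
	if (eb->readers == 0)
		return false;
	eb->readers--;
	return true;
}

/* Drop the write lock; the nested read lock must be gone already. */
bool toy_write_unlock(struct toy_eb *eb, int task)
{
	if (eb->write_owner != task || eb->nested_read)
		return false;
	eb->write_owner = 0;
	return true;
}
```

In this model the only valid nested sequence for one task is write lock, read lock, read unlock, write unlock; any other task's read lock is refused while the write lock is held.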

> + *
> + * The extent buffer locks (also called tree locks) manage access to eb data.

One of my concerns about "access to eb data" is that accessing some
members doesn't really need any lock at all.

Some members should never change during the lifespan of an eb, e.g.
bytenr and transid.

Some code already takes advantage of this, like the tree-checker
checking the transid without holding a lock.
Not sure if we should make use of this.

> + * We want concurrency of many readers and safe updates. The underlying locking
> + * is done by read-write spinlock and the blocking part is implemented using
> + * counters and wait queues.
> + *
> + * spinning semantics - the low-level rwlock is held so all other threads that
> + *                      want to take it are spinning on it.
> + *
> + * blocking semantics - the low-level rwlock is not held but the counter
> + *                      denotes how many times the blocking lock was held;
> + *                      sleeping is possible

What about an example/state machine covering all read/write and
spinning/blocking combinations?
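
As a rough illustration of what such a state machine could look like, the sketch below models only the transitions stated in the quoted comments: a spinning lock holds the low-level rwlock, and switching to blocking releases the rwlock and records the holder in a counter (readers) or a flag (writer). It is a single-threaded user-space toy under those assumptions, not the kernel code; `struct toy_lock_state` and the `model_*` helpers are hypothetical names, nesting is left out, and a `false` return marks where the real lock would spin or sleep.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state of one extent buffer lock; NOT the btrfs implementation. */
struct toy_lock_state {
	int rwlock_readers;   /* holders of the low-level rwlock for read */
	bool rwlock_writer;   /* low-level rwlock held for write */
	int blocking_readers; /* read locks currently in blocking mode */
	bool blocking_writer; /* write lock currently in blocking mode */
};

/* none -> spinning read; refused while any writer exists. */
bool model_read_lock(struct toy_lock_state *s)
{
	if (s->rwlock_writer || s->blocking_writer)
		return false;
	s->rwlock_readers++;
	return true;
}

/* spinning read -> blocking read: rwlock released, counter increased. */
bool model_set_blocking_read(struct toy_lock_state *s)
{
	if (s->rwlock_readers == 0)
		return false;
	s->rwlock_readers--;
	s->blocking_readers++;
	return true;
}

/* spinning read -> none. */
bool model_read_unlock(struct toy_lock_state *s)
{
	if (s->rwlock_readers == 0)
		return false;
	s->rwlock_readers--;
	return true;
}

/* blocking read -> none: rwlock state unchanged, counter decreased. */
bool model_read_unlock_blocking(struct toy_lock_state *s)
{
	if (s->blocking_readers == 0)
		return false;
	s->blocking_readers--;
	return true;
}

/* none -> spinning write; refused while any reader or writer exists. */
bool model_write_lock(struct toy_lock_state *s)
{
	if (s->rwlock_writer || s->blocking_writer ||
	    s->rwlock_readers || s->blocking_readers)
		return false;
	s->rwlock_writer = true;
	return true;
}

/* spinning write -> blocking write: rwlock released, flag set. */
bool model_set_blocking_write(struct toy_lock_state *s)
{
	if (!s->rwlock_writer)
		return false;
	s->rwlock_writer = false;
	s->blocking_writer = true;
	return true;
}

/* spinning or blocking write -> none; one unlock covers both modes. */
bool model_write_unlock(struct toy_lock_state *s)
{
	if (s->rwlock_writer) {
		s->rwlock_writer = false;
		return true;
	}
	if (s->blocking_writer) {
		s->blocking_writer = false;
		return true;
	}
	return false;
}
```

Read this way, each lock has three states per side (none, spinning, blocking), and the asymmetry is in the unlock path: readers need a separate blocking unlock, while a single write unlock handles both modes.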

Thanks,
Qu

> + *
> + * Write lock always allows only one thread to access the data.
> + *
> + *
> + * Debugging
> + * ~~~~~~~~~
> + *
> + * There are additional state counters that are asserted in various contexts,
> + * removed from non-debug build to reduce extent_buffer size and for
> + * performance reasons.
> + */
> +
>  #ifdef CONFIG_BTRFS_DEBUG
>  static inline void btrfs_assert_spinning_writers_get(struct extent_buffer *eb)
>  {
> @@ -80,6 +122,15 @@ static void btrfs_assert_tree_write_locks_get(struct extent_buffer *eb) { }
>  static void btrfs_assert_tree_write_locks_put(struct extent_buffer *eb) { }
>  #endif
>  
> +/*
> + * Mark already held read lock as blocking. Can be nested in write lock by the
> + * same thread.
> + *
> + * Use when there are potentially long operations ahead so other threads
> + * waiting on the lock will not actively spin but sleep instead.
> + *
> + * The rwlock is released and blocking reader counter is increased.
> + */
>  void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
>  {
>  	trace_btrfs_set_lock_blocking_read(eb);
> @@ -96,6 +147,14 @@ void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
>  	read_unlock(&eb->lock);
>  }
>  
> +/*
> + * Mark already held write lock as blocking.
> + *
> + * Use when there are potentially long operations ahead so other threads
> + * waiting on the lock will not actively spin but sleep instead.
> + *
> + * The rwlock is released and blocking writers is set.
> + */
>  void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
>  {
>  	trace_btrfs_set_lock_blocking_write(eb);
> @@ -127,8 +186,13 @@ void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning read lock.  This will wait for any blocking
> - * writers
> + * Lock the extent buffer for read. Wait for any writers (spinning or blocking).
> + * Can be nested in write lock by the same thread.
> + *
> + * Use when the locked section does only lightweight actions and busy waiting
> + * would be cheaper than making other threads do the wait/wake loop.
> + *
> + * The rwlock is held upon exit.
>   */
>  void btrfs_tree_read_lock(struct extent_buffer *eb)
>  {
> @@ -166,9 +230,10 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning read lock.
> - * returns 1 if we get the read lock and 0 if we don't
> - * this won't wait for blocking writers
> + * Lock extent buffer for read, optimistically expecting that there are no
> + * contending blocking writers. If there are, don't wait.
> + *
> + * Return 1 if the rwlock has been taken, 0 otherwise
>   */
>  int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
>  {
> @@ -188,8 +253,9 @@ int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
>  }
>  
>  /*
> - * returns 1 if we get the read lock and 0 if we don't
> - * this won't wait for blocking writers
> + * Try-lock for read. Don't block or wait for contending writers.
> + *
> + * Return 1 if the rwlock has been taken, 0 otherwise
>   */
>  int btrfs_try_tree_read_lock(struct extent_buffer *eb)
>  {
> @@ -211,8 +277,10 @@ int btrfs_try_tree_read_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * returns 1 if we get the read lock and 0 if we don't
> - * this won't wait for blocking writers or readers
> + * Try-lock for write. May block until the lock is uncontended, but does not
> + * wait until it is free.
> + *
> + * Return 1 if the rwlock has been taken, 0 otherwise
>   */
>  int btrfs_try_tree_write_lock(struct extent_buffer *eb)
>  {
> @@ -233,7 +301,10 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a spinning read lock
> + * Release read lock. Must be used only if the lock is in spinning mode.  If
> + * the read lock is nested, must pair with read lock before the write unlock.
> + *
> + * The rwlock is not held upon exit.
>   */
>  void btrfs_tree_read_unlock(struct extent_buffer *eb)
>  {
> @@ -255,7 +326,11 @@ void btrfs_tree_read_unlock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a blocking read lock
> + * Release read lock, previously set to blocking by a pairing call to
> + * btrfs_set_lock_blocking_read(). Can be nested in write lock by the same
> + * thread.
> + *
> + * State of rwlock is unchanged, last reader wakes waiting threads.
>   */
>  void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>  {
> @@ -279,8 +354,10 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>  }
>  
>  /*
> - * take a spinning write lock.  This will wait for both
> - * blocking readers or writers
> + * Lock for write. Wait for all blocking and spinning readers and writers. This
> + * starts context where reader lock could be nested by the same thread.
> + *
> + * The rwlock is held for write upon exit.
>   */
>  void btrfs_tree_lock(struct extent_buffer *eb)
>  {
> @@ -307,7 +384,12 @@ void btrfs_tree_lock(struct extent_buffer *eb)
>  }
>  
>  /*
> - * drop a spinning or a blocking write lock.
> + * Release the write lock, either blocking or spinning (ie. there's no need
> + * for an explicit blocking unlock, like btrfs_tree_read_unlock_blocking).
> + * This also ends the context for nesting, the read lock must have been
> + * released already.
> + *
> + * Tasks blocked and waiting are woken, rwlock is not held upon exit.
>   */
>  void btrfs_tree_unlock(struct extent_buffer *eb)
>  {
> 


