All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Sterba <dsterba@suse.cz>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2] btrfs: trace: Introduce trace events for sleepable tree lock
Date: Wed, 3 Apr 2019 17:43:54 +0200	[thread overview]
Message-ID: <20190403154354.GR29086@twin.jikos.cz> (raw)
In-Reply-To: <20190402094554.8353-1-wqu@suse.com>

On Tue, Apr 02, 2019 at 05:45:54PM +0800, Qu Wenruo wrote:
> There are two tree lock events which can sleep:
> - btrfs_tree_read_lock()
> - btrfs_tree_lock()
> 
> Sometimes we may need to look into the concurrency picture of the fs.
> For that case, we need the execution time of above two functions and the
> owner of @eb.
> 
> Here we introduce a trace events for user space tools like bcc, to get
> the execution time of above two functions, and get detailed owner info
> where eBPF code can't.
> 
> All the overhead is hidden behind the trace events, so if events are not
> enabled, there is no overhead.
> 
> Also, since this patch and later user space tool only cares about the
> execution time and owner, other info like bytenr is ignored in this
> events.

The time is definetelly the interesting part here, I'm thinking if the
block number could also provide some useful hints. The gathered trace
data can be used to group by blocks to compare the times, or see the
histogram of times.

The block at minimum would be useful and I don't want to add more
information there, like level or nritems. In case we do want that
information, a heavier tracepoint version should be added with all the
other data.

> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Hide all ktime_get_ns() call behind trace events
>   So there will be no overhead if trace events are not enabled.
> ---
>  fs/btrfs/locking.c           | 12 ++++++++++++
>  include/trace/events/btrfs.h | 36 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 48 insertions(+)
> 
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index 6df03ba36026..67b77f1d113e 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -158,6 +158,10 @@ void btrfs_clear_lock_blocking_write(struct extent_buffer *eb)
>   */
>  void btrfs_tree_read_lock(struct extent_buffer *eb)
>  {
> +	u64 start_ns = 0;
> +
> +	if (trace_btrfs_tree_read_lock_enabled())
> +		start_ns = ktime_get_ns();
>  again:
>  	BUG_ON(!atomic_read(&eb->blocking_writers) &&
>  	       current->pid == eb->lock_owner);
> @@ -174,6 +178,7 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
>  		BUG_ON(eb->lock_nested);
>  		eb->lock_nested = true;
>  		read_unlock(&eb->lock);
> +		trace_btrfs_tree_read_lock(eb, start_ns);
>  		return;
>  	}
>  	if (atomic_read(&eb->blocking_writers)) {
> @@ -184,6 +189,7 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
>  	}
>  	btrfs_assert_tree_read_locks_get(eb);
>  	btrfs_assert_spinning_readers_get(eb);
> +	trace_btrfs_tree_read_lock(eb, start_ns);
>  }
>  
>  /*
> @@ -299,6 +305,11 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
>   */
>  void btrfs_tree_lock(struct extent_buffer *eb)
>  {
> +	u64 start_ns = 0;
> +
> +	if (trace_btrfs_tree_lock_enabled())
> +		start_ns = ktime_get_ns();
> +
>  	WARN_ON(eb->lock_owner == current->pid);
>  again:
>  	wait_event(eb->read_lock_wq, atomic_read(&eb->blocking_readers) == 0);
> @@ -312,6 +323,7 @@ void btrfs_tree_lock(struct extent_buffer *eb)
>  	btrfs_assert_spinning_writers_get(eb);
>  	btrfs_assert_tree_write_locks_get(eb);
>  	eb->lock_owner = current->pid;
> +	trace_btrfs_tree_lock(eb, start_ns);
>  }
>  
>  /*
> diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
> index 8b12753fee78..406a71571254 100644
> --- a/include/trace/events/btrfs.h
> +++ b/include/trace/events/btrfs.h
> @@ -2005,6 +2005,42 @@ TRACE_EVENT(btrfs_convert_extent_bit,
>  		  __print_flags(__entry->clear_bits, "|", EXTENT_FLAGS))
>  );
>  
> +DECLARE_EVENT_CLASS(btrfs_sleep_tree_lock,
> +	TP_PROTO(const struct extent_buffer *eb, u64 start_ns),
> +
> +	TP_ARGS(eb, start_ns),
> +
> +	TP_STRUCT__entry_btrfs(
> +		__field(	u64,	start_ns	)
> +		__field(	u64,	end_ns		)
> +		__field(	u64,	owner		)
> +		__field(	int,	is_log_tree	)
> +	),
> +
> +	TP_fast_assign_btrfs(eb->fs_info,
> +		__entry->start_ns	= start_ns;
> +		__entry->end_ns		= ktime_get_ns();
> +		__entry->owner		= btrfs_header_owner(eb);
> +		__entry->is_log_tree	= (eb->log_index >= 0);
> +	),
> +
> +	TP_printk_btrfs("start_ns=%llu end_ns=%llu owner=%llu is_log_tree=%d",
> +		__entry->start_ns, __entry->end_ns, __entry->owner,
> +		__entry->is_log_tree)

I think the delta between start_ns and end_ns should be printed too, so
it's obvious without further processing:

	TP_printk_btrfs("start_ns=%llu end_ns=%llu diff=%llu owner=%llu is_log_tree=%d",

  reply	other threads:[~2019-04-03 15:42 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-02  9:45 [PATCH v2] btrfs: trace: Introduce trace events for sleepable tree lock Qu Wenruo
2019-04-03 15:43 ` David Sterba [this message]
2019-04-04  0:56   ` Qu Wenruo
2019-04-08 23:38     ` David Sterba
2019-04-09 13:15       ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190403154354.GR29086@twin.jikos.cz \
    --to=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.