linux-btrfs.vger.kernel.org archive mirror
* [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking
@ 2019-04-15 13:15 Qu Wenruo
  2019-04-15 13:15 ` [PATCH v3 1/2] btrfs: trace: Introduce trace events for sleepable tree lock Qu Wenruo
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Qu Wenruo @ 2019-04-15 13:15 UTC (permalink / raw)
  To: linux-btrfs

This branch can be fetched from GitHub:
https://github.com/adam900710/linux/tree/trace_locking
which is based on the v5.1-rc4 tag.

This patchset introduces the following trace events for the corresponding
functions:
  btrfs_tree_lock			(sleepable)
  btrfs_tree_unlock
  btrfs_tree_read_lock			(sleepable)
  btrfs_tree_read_unlock
  btrfs_tree_read_unlock_blocking
  btrfs_set_lock_blocking_read
  btrfs_set_lock_blocking_write
  btrfs_clear_lock_blocking_read
  btrfs_clear_lock_blocking_write
  btrfs_try_tree_read_lock
  btrfs_try_tree_write_lock
  btrfs_tree_read_lock_atomic

The two sleepable ones carry extra info: start_ns, end_ns and diff_ns.
Those two are especially useful for measuring tree lock wait time.
There is already a tool utilizing those two events:
https://github.com/adam900710/btrfs-profiler/blob/master/tree_lock_wait.py

The non-sleepable ones can be useful for a user-space-based lock
leakage/deadlock detector.
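
To make the sleepable events concrete, here is a minimal user-space sketch
(independent of the linked bcc tool) that parses diff_ns out of trace output
and reports average wait time per event. The sample lines are hypothetical;
the exact trace_pipe format depends on the kernel and tracefs configuration.

```python
# Sketch: summarize tree-lock wait time from the two sleepable events.
# The sample lines below are made up; real output comes from
# /sys/kernel/debug/tracing/trace_pipe with the events enabled.
import re
from statistics import mean

SAMPLE = """\
kworker/u8:1-123 [000] .... 100.000001: btrfs_tree_lock: block=30408704 generation=7 start_ns=1000 end_ns=6000 diff_ns=5000 owner=5 is_log_tree=0
kworker/u8:2-124 [001] .... 100.000002: btrfs_tree_read_lock: block=30408704 generation=7 start_ns=2000 end_ns=2500 diff_ns=500 owner=5 is_log_tree=0
"""

def wait_times(text):
    """Return {event_name: [diff_ns, ...]} for the sleepable lock events."""
    pat = re.compile(r"(btrfs_tree(?:_read)?_lock): .*?diff_ns=(\d+)")
    out = {}
    for m in pat.finditer(text):
        out.setdefault(m.group(1), []).append(int(m.group(2)))
    return out

for name, diffs in wait_times(SAMPLE).items():
    print(f"{name}: n={len(diffs)} avg_wait_ns={mean(diffs):.0f}")
```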

Changelog:
v2:
- Hide all ktime_get_ns() calls behind the trace events
  So there is no overhead if the trace events are not enabled.

v3:
- More output for humans to glance at
- More trace events for later bcc usage,
  like lock leakage and deadlock detection.

Qu Wenruo (2):
  btrfs: trace: Introduce trace events for sleepable tree lock
  btrfs: trace: Introduce trace events for all btrfs tree locking events

 fs/btrfs/locking.c           | 22 +++++++++
 include/trace/events/btrfs.h | 86 ++++++++++++++++++++++++++++++++++++
 2 files changed, 108 insertions(+)

-- 
2.21.0



* [PATCH v3 1/2] btrfs: trace: Introduce trace events for sleepable tree lock
  2019-04-15 13:15 [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking Qu Wenruo
@ 2019-04-15 13:15 ` Qu Wenruo
  2019-04-15 13:15 ` [PATCH v3 2/2] btrfs: trace: Introduce trace events for all btrfs tree locking events Qu Wenruo
  2019-04-15 14:28 ` [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking David Sterba
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2019-04-15 13:15 UTC (permalink / raw)
  To: linux-btrfs

There are two tree lock operations which can sleep:
- btrfs_tree_read_lock()
- btrfs_tree_lock()

Sometimes we may need to look into the concurrency picture of the fs.
For that, we need the execution time of the above two functions and the
owner of @eb.

Here we introduce trace events for user space tools like bcc, to get
the execution time of the above two functions, and detailed owner info
where eBPF code can't.

All the overhead is hidden behind the trace events, so if the events are
not enabled, there is no overhead.

These trace events also output bytenr and generation, allowing them to be
paired with unlock events to pin down deadlocks.
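
The pairing idea can be sketched in a few lines of user-space Python. The
event stream below is a hypothetical, simplified reduction of what the
tracepoints emit (task pid, event name, block bytenr); a real consumer would
read these from the trace buffer.

```python
# Sketch: match lock events against unlock events by (task, block) to
# spot locks that are never released. Simplified: real read locks can
# nest, which a production checker would need to count, not just set.
def find_leaks(events):
    """events: iterable of (task, event_name, block); return keys still held."""
    held = set()
    for task, name, block in events:
        key = (task, block)
        if name in ("btrfs_tree_lock", "btrfs_tree_read_lock"):
            held.add(key)
        elif name in ("btrfs_tree_unlock", "btrfs_tree_read_unlock"):
            held.discard(key)
    return held

trace = [
    (100, "btrfs_tree_lock", 30408704),
    (100, "btrfs_tree_unlock", 30408704),
    (101, "btrfs_tree_read_lock", 30412800),  # never unlocked -> leaked
]
print(find_leaks(trace))  # the read lock on block 30412800 is still held
```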

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/locking.c           | 12 ++++++++++
 include/trace/events/btrfs.h | 44 ++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index 82b84e4daad1..b414d64172b2 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -90,6 +90,10 @@ void btrfs_clear_lock_blocking_write(struct extent_buffer *eb)
  */
 void btrfs_tree_read_lock(struct extent_buffer *eb)
 {
+	u64 start_ns = 0;
+
+	if (trace_btrfs_tree_read_lock_enabled())
+		start_ns = ktime_get_ns();
 again:
 	BUG_ON(!atomic_read(&eb->blocking_writers) &&
 	       current->pid == eb->lock_owner);
@@ -106,6 +110,7 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
 		BUG_ON(eb->lock_nested);
 		eb->lock_nested = 1;
 		read_unlock(&eb->lock);
+		trace_btrfs_tree_read_lock(eb, start_ns);
 		return;
 	}
 	if (atomic_read(&eb->blocking_writers)) {
@@ -116,6 +121,7 @@ void btrfs_tree_read_lock(struct extent_buffer *eb)
 	}
 	atomic_inc(&eb->read_locks);
 	atomic_inc(&eb->spinning_readers);
+	trace_btrfs_tree_read_lock(eb, start_ns);
 }
 
 /*
@@ -232,6 +238,11 @@ void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
  */
 void btrfs_tree_lock(struct extent_buffer *eb)
 {
+	u64 start_ns = 0;
+
+	if (trace_btrfs_tree_lock_enabled())
+		start_ns = ktime_get_ns();
+
 	WARN_ON(eb->lock_owner == current->pid);
 again:
 	wait_event(eb->read_lock_wq, atomic_read(&eb->blocking_readers) == 0);
@@ -246,6 +257,7 @@ void btrfs_tree_lock(struct extent_buffer *eb)
 	atomic_inc(&eb->spinning_writers);
 	atomic_inc(&eb->write_locks);
 	eb->lock_owner = current->pid;
+	trace_btrfs_tree_lock(eb, start_ns);
 }
 
 /*
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index ab1cc33adbac..389f4722ca02 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -1850,6 +1850,50 @@ DEFINE_EVENT(btrfs__block_group, btrfs_skip_unused_block_group,
 	TP_ARGS(bg_cache)
 );
 
+DECLARE_EVENT_CLASS(btrfs_sleep_tree_lock,
+	TP_PROTO(const struct extent_buffer *eb, u64 start_ns),
+
+	TP_ARGS(eb, start_ns),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	block		)
+		__field(	u64,	generation	)
+		__field(	u64,	start_ns	)
+		__field(	u64,	end_ns		)
+		__field(	u64,	diff_ns		)
+		__field(	u64,	owner		)
+		__field(	int,	is_log_tree	)
+	),
+
+	TP_fast_assign_btrfs(eb->fs_info,
+		__entry->block		= eb->start;
+		__entry->generation	= btrfs_header_generation(eb);
+		__entry->start_ns	= start_ns;
+		__entry->end_ns		= ktime_get_ns();
+		__entry->diff_ns	= __entry->end_ns - start_ns;
+		__entry->owner		= btrfs_header_owner(eb);
+		__entry->is_log_tree	= (eb->log_index >= 0);
+	),
+
+	TP_printk_btrfs(
+"block=%llu generation=%llu start_ns=%llu end_ns=%llu diff_ns=%llu owner=%llu is_log_tree=%d",
+		__entry->block, __entry->generation,
+		__entry->start_ns, __entry->end_ns, __entry->diff_ns,
+		__entry->owner, __entry->is_log_tree)
+);
+
+DEFINE_EVENT(btrfs_sleep_tree_lock, btrfs_tree_read_lock,
+	TP_PROTO(const struct extent_buffer *eb, u64 start_ns),
+
+	TP_ARGS(eb, start_ns)
+);
+
+DEFINE_EVENT(btrfs_sleep_tree_lock, btrfs_tree_lock,
+	TP_PROTO(const struct extent_buffer *eb, u64 start_ns),
+
+	TP_ARGS(eb, start_ns)
+);
+
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */
-- 
2.21.0



* [PATCH v3 2/2] btrfs: trace: Introduce trace events for all btrfs tree locking events
  2019-04-15 13:15 [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking Qu Wenruo
  2019-04-15 13:15 ` [PATCH v3 1/2] btrfs: trace: Introduce trace events for sleepable tree lock Qu Wenruo
@ 2019-04-15 13:15 ` Qu Wenruo
  2019-04-15 14:28 ` [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking David Sterba
  2 siblings, 0 replies; 4+ messages in thread
From: Qu Wenruo @ 2019-04-15 13:15 UTC (permalink / raw)
  To: linux-btrfs

Unlike btrfs_tree_lock() and btrfs_tree_read_lock(), the remaining
functions in locking.c will not sleep, thus it doesn't make much sense to
record their execution time.

These events are introduced mainly for user space tools to audit and
detect lock leakage or deadlocks.
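
As a sketch of the kind of audit these events enable, the snippet below
derives lock-ordering edges from a hypothetical, simplified event stream:
while a task holds block A and then acquires block B, record the edge A -> B;
seeing both A -> B and B -> A across tasks flags a potential ABBA deadlock.

```python
# Sketch: lock-ordering check built on the locking events. The event
# tuples (task, op, block) are a made-up reduction of the tracepoint
# output; a real tool would consume the trace buffer instead.
from collections import defaultdict

def order_edges(events):
    """events: iterable of (task, op, block), op in {'lock', 'unlock'}."""
    held = defaultdict(list)   # task -> blocks currently held, in order
    edges = set()              # (earlier_block, later_block) pairs
    for task, op, block in events:
        if op == "lock":
            for prev in held[task]:
                edges.add((prev, block))
            held[task].append(block)
        elif block in held[task]:
            held[task].remove(block)
    return edges

def has_abba(edges):
    """True if two blocks were ever taken in opposite orders."""
    return any((b, a) in edges for a, b in edges)

trace = [
    ("T1", "lock", 1), ("T1", "lock", 2),
    ("T1", "unlock", 2), ("T1", "unlock", 1),
    ("T2", "lock", 2), ("T2", "lock", 1),  # opposite order -> potential ABBA
]
edges = order_edges(trace)
print(sorted(edges), has_abba(edges))
```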

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/locking.c           | 10 +++++++++
 include/trace/events/btrfs.h | 42 ++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
index b414d64172b2..a09823474d9c 100644
--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -16,6 +16,7 @@ static void btrfs_assert_tree_read_locked(struct extent_buffer *eb);
 
 void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
 {
+	trace_btrfs_set_lock_blocking_read(eb);
 	/*
 	 * No lock is required.  The lock owner may change if we have a read
 	 * lock, but it won't change to or away from us.  If we have the write
@@ -32,6 +33,7 @@ void btrfs_set_lock_blocking_read(struct extent_buffer *eb)
 
 void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
 {
+	trace_btrfs_set_lock_blocking_write(eb);
 	/*
 	 * No lock is required.  The lock owner may change if we have a read
 	 * lock, but it won't change to or away from us.  If we have the write
@@ -50,6 +52,7 @@ void btrfs_set_lock_blocking_write(struct extent_buffer *eb)
 
 void btrfs_clear_lock_blocking_read(struct extent_buffer *eb)
 {
+	trace_btrfs_clear_lock_blocking_read(eb);
 	/*
 	 * No lock is required.  The lock owner may change if we have a read
 	 * lock, but it won't change to or away from us.  If we have the write
@@ -67,6 +70,7 @@ void btrfs_clear_lock_blocking_read(struct extent_buffer *eb)
 
 void btrfs_clear_lock_blocking_write(struct extent_buffer *eb)
 {
+	trace_btrfs_clear_lock_blocking_write(eb);
 	/*
 	 * no lock is required.  The lock owner may change if
 	 * we have a read lock, but it won't change to or away
@@ -141,6 +145,7 @@ int btrfs_tree_read_lock_atomic(struct extent_buffer *eb)
 	}
 	atomic_inc(&eb->read_locks);
 	atomic_inc(&eb->spinning_readers);
+	trace_btrfs_tree_read_lock_atomic(eb);
 	return 1;
 }
 
@@ -162,6 +167,7 @@ int btrfs_try_tree_read_lock(struct extent_buffer *eb)
 	}
 	atomic_inc(&eb->read_locks);
 	atomic_inc(&eb->spinning_readers);
+	trace_btrfs_try_tree_read_lock(eb);
 	return 1;
 }
 
@@ -184,6 +190,7 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb)
 	atomic_inc(&eb->write_locks);
 	atomic_inc(&eb->spinning_writers);
 	eb->lock_owner = current->pid;
+	trace_btrfs_try_tree_write_lock(eb);
 	return 1;
 }
 
@@ -192,6 +199,7 @@ int btrfs_try_tree_write_lock(struct extent_buffer *eb)
  */
 void btrfs_tree_read_unlock(struct extent_buffer *eb)
 {
+	trace_btrfs_tree_read_unlock(eb);
 	/*
 	 * if we're nested, we have the write lock.  No new locking
 	 * is needed as long as we are the lock owner.
@@ -214,6 +222,7 @@ void btrfs_tree_read_unlock(struct extent_buffer *eb)
  */
 void btrfs_tree_read_unlock_blocking(struct extent_buffer *eb)
 {
+	trace_btrfs_tree_read_unlock_blocking(eb);
 	/*
 	 * if we're nested, we have the write lock.  No new locking
 	 * is needed as long as we are the lock owner.
@@ -270,6 +279,7 @@ void btrfs_tree_unlock(struct extent_buffer *eb)
 	BUG_ON(blockers > 1);
 
 	btrfs_assert_tree_locked(eb);
+	trace_btrfs_tree_unlock(eb);
 	eb->lock_owner = 0;
 	atomic_dec(&eb->write_locks);
 
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 389f4722ca02..052263bd071c 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -1894,6 +1894,48 @@ DEFINE_EVENT(btrfs_sleep_tree_lock, btrfs_tree_lock,
 	TP_ARGS(eb, start_ns)
 );
 
+DECLARE_EVENT_CLASS(btrfs_locking_events,
+	TP_PROTO(const struct extent_buffer *eb),
+
+	TP_ARGS(eb),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	block		)
+		__field(	u64,	generation	)
+		__field(	u64,	owner		)
+		__field(	int,	is_log_tree	)
+	),
+
+	TP_fast_assign_btrfs(eb->fs_info,
+		__entry->block		= eb->start;
+		__entry->generation	= btrfs_header_generation(eb);
+		__entry->owner		= btrfs_header_owner(eb);
+		__entry->is_log_tree	= (eb->log_index >= 0);
+	),
+
+	TP_printk_btrfs("block=%llu generation=%llu owner=%llu is_log_tree=%d",
+		__entry->block, __entry->generation,
+		__entry->owner, __entry->is_log_tree)
+);
+
+#define DEFINE_BTRFS_LOCK_EVENT(name)				\
+DEFINE_EVENT(btrfs_locking_events, name,			\
+		TP_PROTO(const struct extent_buffer *eb),	\
+								\
+		TP_ARGS(eb)					\
+)
+
+DEFINE_BTRFS_LOCK_EVENT(btrfs_tree_unlock);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_tree_read_unlock);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_tree_read_unlock_blocking);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_set_lock_blocking_read);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_set_lock_blocking_write);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_clear_lock_blocking_read);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_clear_lock_blocking_write);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_try_tree_read_lock);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_try_tree_write_lock);
+DEFINE_BTRFS_LOCK_EVENT(btrfs_tree_read_lock_atomic);
+
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */
-- 
2.21.0



* Re: [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking
  2019-04-15 13:15 [PATCH v3 0/2] btrfs: trace: Trace events for btrfs tree locking Qu Wenruo
  2019-04-15 13:15 ` [PATCH v3 1/2] btrfs: trace: Introduce trace events for sleepable tree lock Qu Wenruo
  2019-04-15 13:15 ` [PATCH v3 2/2] btrfs: trace: Introduce trace events for all btrfs tree locking events Qu Wenruo
@ 2019-04-15 14:28 ` David Sterba
  2 siblings, 0 replies; 4+ messages in thread
From: David Sterba @ 2019-04-15 14:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Mon, Apr 15, 2019 at 09:15:23PM +0800, Qu Wenruo wrote:
> [...]
> 
> v3:
> - More output for human to take a glance
> - More trace events for later bcc usage
>   Like lock leakage and deadlock detection.

V3 looks good to me, thanks. The enhanced output of the sleepable lock
is still sane.

