Linux-ext4 Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow
@ 2019-09-02 14:54 Xiaoguang Wang
  2019-09-02 14:54 ` [PATCH 2/2] jbd2: add new tracepoint jbd2_wait_on_transaction_locked Xiaoguang Wang
  2019-09-07 16:21 ` [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Theodore Y. Ts'o
  0 siblings, 2 replies; 6+ messages in thread
From: Xiaoguang Wang @ 2019-09-02 14:54 UTC (permalink / raw)
  To: linux-ext4; +Cc: Xiaoguang Wang

Sometimes process will be stalled in "wait_on_bit_io(&bh->b_state,
BH_Shadow, TASK_UNINTERRUPTIBLE)" for a while, and in order to analyse
app's latency thoroughly, add a new tracepoint to track this delay.

Trace info likes below:
fsstress-5068  [008] .... 11007.757543: jbd2_sleep_on_shadow: dev 254,17 sleep 1
fsstress-5070  [007] .... 11007.757544: jbd2_sleep_on_shadow: dev 254,17 sleep 2
fsstress-5069  [009] .... 11007.757548: jbd2_sleep_on_shadow: dev 254,17 sleep 2
fsstress-5067  [011] .... 11007.757569: jbd2_sleep_on_shadow: dev 254,17 sleep 1
fsstress-5063  [007] .... 11007.757651: jbd2_sleep_on_shadow: dev 254,17 sleep 2
fsstress-5070  [007] .... 11007.757792: jbd2_sleep_on_shadow: dev 254,17 sleep 0
fsstress-5071  [011] .... 11007.763493: jbd2_sleep_on_shadow: dev 254,17 sleep 1

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 fs/jbd2/transaction.c       |  3 +++
 include/trace/events/jbd2.h | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 990e7b5062e7..84974fb9d4f9 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -991,7 +991,10 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
 	if (buffer_shadow(bh)) {
 		JBUFFER_TRACE(jh, "on shadow: sleep");
 		jbd_unlock_bh_state(bh);
+		start_lock = jiffies;
 		wait_on_bit_io(&bh->b_state, BH_Shadow, TASK_UNINTERRUPTIBLE);
+		trace_jbd2_sleep_on_shadow(bh->b_bdev->bd_dev,
+			jiffies_to_msecs(jiffies - start_lock));
 		goto repeat;
 	}
 
diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 2310b259329f..2f048fdb63c6 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -380,6 +380,27 @@ TRACE_EVENT(jbd2_lock_buffer_stall,
 		__entry->stall_ms)
 );
 
+TRACE_EVENT(jbd2_sleep_on_shadow,
+
+	TP_PROTO(dev_t dev, unsigned long stall_ms),
+
+	TP_ARGS(dev, stall_ms),
+
+	TP_STRUCT__entry(
+		__field(        dev_t, dev	)
+		__field(unsigned long, stall_ms	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= dev;
+		__entry->stall_ms	= stall_ms;
+	),
+
+	TP_printk("dev %d,%d sleep %lu",
+		MAJOR(__entry->dev), MINOR(__entry->dev),
+		__entry->stall_ms)
+);
+
 #endif /* _TRACE_JBD2_H */
 
 /* This part must be outside protection */
-- 
2.17.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/2] jbd2: add new tracepoint jbd2_wait_on_transaction_locked
  2019-09-02 14:54 [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Xiaoguang Wang
@ 2019-09-02 14:54 ` Xiaoguang Wang
  2019-09-07 16:21 ` [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Theodore Y. Ts'o
  1 sibling, 0 replies; 6+ messages in thread
From: Xiaoguang Wang @ 2019-09-02 14:54 UTC (permalink / raw)
  To: linux-ext4; +Cc: Xiaoguang Wang

Sometimes process will be stalled in wait_transaction_locked() for a while,
also add a new tracepoint to track this delay.

Trace info likes below:
fsstress-1672  [009] ....   184.663043: jbd2_wait_on_transaction_locked: dev 254,17 wait 0
fsstress-1674  [002] ....   184.771556: jbd2_wait_on_transaction_locked: dev 254,17 wait 42
fsstress-1676  [005] ....   184.771562: jbd2_wait_on_transaction_locked: dev 254,17 wait 100
fsstress-1677  [003] ....   184.771567: jbd2_wait_on_transaction_locked: dev 254,17 wait 102
kworker/13:1-160   [013] ....   184.771619: jbd2_wait_on_transaction_locked: dev 254,17 wait 102
fsstress-1673  [003] ....   184.771675: jbd2_wait_on_transaction_locked: dev 254,17 wait 95

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
---
 fs/jbd2/transaction.c       |  3 +++
 include/trace/events/jbd2.h | 21 +++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 84974fb9d4f9..43f4f7fadaec 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -148,6 +148,7 @@ static void wait_transaction_locked(journal_t *journal)
 	DEFINE_WAIT(wait);
 	int need_to_start;
 	tid_t tid = journal->j_running_transaction->t_tid;
+	unsigned long start = jiffies;
 
 	prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
 			TASK_UNINTERRUPTIBLE);
@@ -158,6 +159,8 @@ static void wait_transaction_locked(journal_t *journal)
 	jbd2_might_wait_for_commit(journal);
 	schedule();
 	finish_wait(&journal->j_wait_transaction_locked, &wait);
+	trace_jbd2_wait_on_transaction_locked(journal->j_fs_dev->bd_dev,
+		jiffies_to_msecs(jiffies - start));
 }
 
 /*
diff --git a/include/trace/events/jbd2.h b/include/trace/events/jbd2.h
index 2f048fdb63c6..6f091f901223 100644
--- a/include/trace/events/jbd2.h
+++ b/include/trace/events/jbd2.h
@@ -401,6 +401,27 @@ TRACE_EVENT(jbd2_sleep_on_shadow,
 		__entry->stall_ms)
 );
 
+TRACE_EVENT(jbd2_wait_on_transaction_locked,
+
+	TP_PROTO(dev_t dev, unsigned long stall_ms),
+
+	TP_ARGS(dev, stall_ms),
+
+	TP_STRUCT__entry(
+		__field(        dev_t, dev	)
+		__field(unsigned long, stall_ms	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= dev;
+		__entry->stall_ms	= stall_ms;
+	),
+
+	TP_printk("dev %d,%d wait %lu",
+		MAJOR(__entry->dev), MINOR(__entry->dev),
+		__entry->stall_ms)
+);
+
 #endif /* _TRACE_JBD2_H */
 
 /* This part must be outside protection */
-- 
2.17.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow
  2019-09-02 14:54 [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Xiaoguang Wang
  2019-09-02 14:54 ` [PATCH 2/2] jbd2: add new tracepoint jbd2_wait_on_transaction_locked Xiaoguang Wang
@ 2019-09-07 16:21 ` Theodore Y. Ts'o
  2019-09-11  6:52   ` Xiaoguang Wang
  1 sibling, 1 reply; 6+ messages in thread
From: Theodore Y. Ts'o @ 2019-09-07 16:21 UTC (permalink / raw)
  To: Xiaoguang Wang; +Cc: linux-ext4

On Mon, Sep 02, 2019 at 10:54:41PM +0800, Xiaoguang Wang wrote:
> Sometimes process will be stalled in "wait_on_bit_io(&bh->b_state,
> BH_Shadow, TASK_UNINTERRUPTIBLE)" for a while, and in order to analyse
> app's latency thoroughly, add a new tracepoint to track this delay.
> 
> Trace info likes below:
> fsstress-5068  [008] .... 11007.757543: jbd2_sleep_on_shadow: dev 254,17 sleep 1
> fsstress-5070  [007] .... 11007.757544: jbd2_sleep_on_shadow: dev 254,17 sleep 2
> fsstress-5069  [009] .... 11007.757548: jbd2_sleep_on_shadow: dev 254,17 sleep 2
> fsstress-5067  [011] .... 11007.757569: jbd2_sleep_on_shadow: dev 254,17 sleep 1
> fsstress-5063  [007] .... 11007.757651: jbd2_sleep_on_shadow: dev 254,17 sleep 2
> fsstress-5070  [007] .... 11007.757792: jbd2_sleep_on_shadow: dev 254,17 sleep 0
> fsstress-5071  [011] .... 11007.763493: jbd2_sleep_on_shadow: dev 254,17 sleep 1
> 
> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>

I think maybe it might be better to use units of microseconds and then
change sleep to usleep so the units are clear?  This is a spinlock, so
it should be quick.

For the other patch in this series, milliseconds seems fine, but if we
change the trace info to use "msleep" instead that would be clearer
--- or you could change it to use microseconds as well just for
consistency; I think either would be fine.

What do you think?

Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow
  2019-09-07 16:21 ` [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Theodore Y. Ts'o
@ 2019-09-11  6:52   ` Xiaoguang Wang
  2019-09-11 13:57     ` Theodore Y. Ts'o
  0 siblings, 1 reply; 6+ messages in thread
From: Xiaoguang Wang @ 2019-09-11  6:52 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: linux-ext4

hi,

Thanks for reviewing.
> On Mon, Sep 02, 2019 at 10:54:41PM +0800, Xiaoguang Wang wrote:
>> Sometimes process will be stalled in "wait_on_bit_io(&bh->b_state,
>> BH_Shadow, TASK_UNINTERRUPTIBLE)" for a while, and in order to analyse
>> app's latency thoroughly, add a new tracepoint to track this delay.
>>
>> Trace info likes below:
>> fsstress-5068  [008] .... 11007.757543: jbd2_sleep_on_shadow: dev 254,17 sleep 1
>> fsstress-5070  [007] .... 11007.757544: jbd2_sleep_on_shadow: dev 254,17 sleep 2
>> fsstress-5069  [009] .... 11007.757548: jbd2_sleep_on_shadow: dev 254,17 sleep 2
>> fsstress-5067  [011] .... 11007.757569: jbd2_sleep_on_shadow: dev 254,17 sleep 1
>> fsstress-5063  [007] .... 11007.757651: jbd2_sleep_on_shadow: dev 254,17 sleep 2
>> fsstress-5070  [007] .... 11007.757792: jbd2_sleep_on_shadow: dev 254,17 sleep 0
>> fsstress-5071  [011] .... 11007.763493: jbd2_sleep_on_shadow: dev 254,17 sleep 1
>>
>> Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
> 
> I think maybe it might be better to use units of microseconds and then
> change sleep to usleep so the units are clear?  This is a spinlock, so
> it should be quick.
Sorry, I may not quite understand you, do you mean that milliseconds is not precise, so
should use microseconds? For these two patches, they do not use usleep or msleep to do
real sleep work, they just record the duration which process takes to wait bh_shadow flag
to be cleared or transaction to be unlocked.

Regards,
Xiaougang Wang

> 
> For the other patch in this series, milliseconds seems fine, but if we
> change the trace info to use "msleep" instead that would be clearer
> --- or you could change it to use microseconds as well just for
> consistency; I think either would be fine.
> 
> What do you think?
> 
> Cheers,
> 
> 						- Ted
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow
  2019-09-11  6:52   ` Xiaoguang Wang
@ 2019-09-11 13:57     ` Theodore Y. Ts'o
  2019-09-16 14:20       ` Xiaoguang Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Theodore Y. Ts'o @ 2019-09-11 13:57 UTC (permalink / raw)
  To: Xiaoguang Wang; +Cc: linux-ext4

On Wed, Sep 11, 2019 at 02:52:51PM +0800, Xiaoguang Wang wrote:
> > I think maybe it might be better to use units of microseconds and then
> > change sleep to usleep so the units are clear?  This is a spinlock, so
> > it should be quick.
>
> Sorry, I may not quite understand you, do you mean that milliseconds is not precise, so
> should use microseconds? For these two patches, they do not use usleep or msleep to do
> real sleep work, they just record the duration which process takes to wait bh_shadow flag
> to be cleared or transaction to be unlocked.

Apologies, I should have been clear enough.  Yes, my concern that
milliseconds might not be fine-grained enough.  The sample results
which you showed had values of 2ms, 1ms, and 0ms.  And the single 0ms
result in particular raised the concern that we should use a
microseconds instead of milliseconds.

In fact, instead of "sleep", maybe "latency(us)" or "latency(ms)"
would be a better label?

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow
  2019-09-11 13:57     ` Theodore Y. Ts'o
@ 2019-09-16 14:20       ` Xiaoguang Wang
  0 siblings, 0 replies; 6+ messages in thread
From: Xiaoguang Wang @ 2019-09-16 14:20 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: linux-ext4

hi,

> On Wed, Sep 11, 2019 at 02:52:51PM +0800, Xiaoguang Wang wrote:
>>> I think maybe it might be better to use units of microseconds and then
>>> change sleep to usleep so the units are clear?  This is a spinlock, so
>>> it should be quick.
>>
>> Sorry, I may not quite understand you, do you mean that milliseconds is not precise, so
>> should use microseconds? For these two patches, they do not use usleep or msleep to do
>> real sleep work, they just record the duration which process takes to wait bh_shadow flag
>> to be cleared or transaction to be unlocked.
> 
> Apologies, I should have been clear enough.  Yes, my concern that
> milliseconds might not be fine-grained enough.  The sample results
> which you showed had values of 2ms, 1ms, and 0ms.  And the single 0ms
> result in particular raised the concern that we should use a
> microseconds instead of milliseconds.
> 
> In fact, instead of "sleep", maybe "latency(us)" or "latency(ms)"
> would be a better label?
OK, I'll update a v2, thanks.

Regards,
Xiaoguang Wang

> 
> Regards,
> 
> 						- Ted
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, back to index

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-02 14:54 [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Xiaoguang Wang
2019-09-02 14:54 ` [PATCH 2/2] jbd2: add new tracepoint jbd2_wait_on_transaction_locked Xiaoguang Wang
2019-09-07 16:21 ` [PATCH 1/2] jbd2: add new tracepoint jbd2_sleep_on_shadow Theodore Y. Ts'o
2019-09-11  6:52   ` Xiaoguang Wang
2019-09-11 13:57     ` Theodore Y. Ts'o
2019-09-16 14:20       ` Xiaoguang Wang

Linux-ext4 Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-ext4/0 linux-ext4/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-ext4 linux-ext4/ https://lore.kernel.org/linux-ext4 \
		linux-ext4@vger.kernel.org linux-ext4@archiver.kernel.org
	public-inbox-index linux-ext4


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-ext4


AGPL code for this site: git clone https://public-inbox.org/ public-inbox