* [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
@ 2009-04-24 21:47 ` Alan D. Brunelle
  0 siblings, 0 replies; 15+ messages in thread
From: Alan D. Brunelle @ 2009-04-24 21:47 UTC (permalink / raw)
  To: ryov; +Cc: dm-devel, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5872 bytes --]

Hi Ryo -

I don't know if you are taking in patches, but whilst trying to uncover
some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
you're keeping one code base for old stuff (2.6.18-ish RHEL kernels) and
upstream, you'll have to #if around these (blktrace message support
came in around 2.6.26 or 27, I think).

My test case was to take a single 400GB storage device, put two 200GB
partitions on it, and then see what the "penalty" or overhead is for
adding dm-ioband on top. To do this I simply created an ext2 FS on each
partition in parallel (two processes, each doing a mkfs to one of the
partitions). Then I put two dm-ioband devices on top of the two
partitions (setting the weight to 100 in both cases - thus they should
have equal access).

Using default values I was seeing /very/ large differences - on the
order of 3X. When I bumped the number of tokens to a large number
(10,240) the timings got much closer (<2%). I have found that using
weight-iosize performs worse than weight (closer to 5% penalty).

I'll try to formalize these results as I go forward and report on
them. In any event, I thought I'd share this patch with you in case you
are interested...

Here's a sampling from some blktrace output (sorry for the wrapping) - I
should note that I'm a bit scared to see such large numbers of holds
going on when the token count should be >5,000 for each device...
Holding these back in an equal-access situation inhibits the block
I/O layer from merging (most of) these (as mkfs performs lots & lots of
small but sequential I/Os).

...
  8,80  16        0     0.090651446     0  m   N ioband 1 hold_nrm 1654
  8,80  16        0     0.090653575     0  m   N ioband 1 hold_nrm 1655
  8,80  16        0     0.090655694     0  m   N ioband 1 hold_nrm 1656
  8,80  16        0     0.090657609     0  m   N ioband 1 hold_nrm 1657
  8,80  16        0     0.090659554     0  m   N ioband 1 hold_nrm 1658
  8,80  16        0     0.090661327     0  m   N ioband 1 hold_nrm 1659
  8,80  16        0     0.090666237     0  m   N ioband 1 hold_nrm 1660
  8,80  16    53036     0.090675081  4713  C   W 391420657 + 1024 [0]
  8,80  16    53037     0.090913365  4713  D   W 392995569 + 1024 [mkfs.ext2]
  8,80  16        0     0.090950380     0  m   N ioband 1 add_iss 1659 1659
  8,80  16        0     0.090951296     0  m   N ioband 1 add_iss 1658 1658
  8,80  16        0     0.090951870     0  m   N ioband 1 add_iss 1657 1657
  8,80  16        0     0.090952416     0  m   N ioband 1 add_iss 1656 1656
  8,80  16        0     0.090952965     0  m   N ioband 1 add_iss 1655 1655
  8,80  16        0     0.090953517     0  m   N ioband 1 add_iss 1654 1654
  8,80  16        0     0.090954064     0  m   N ioband 1 add_iss 1653 1653
  8,80  16        0     0.090954610     0  m   N ioband 1 add_iss 1652 1652
  8,80  16        0     0.090955280     0  m   N ioband 1 add_iss 1651 1651
  8,80  16        0     0.090956495     0  m   N ioband 1 pop_iss
  8,80  16    53038     0.090957387  4659  A  WS 396655745 + 8 <- (8,82) 6030744
  8,80  16    53039     0.090957561  4659  Q  WS 396655745 + 8 [kioband/16]
  8,80  16    53040     0.090958328  4659  M  WS 396655745 + 8 [kioband/16]
  8,80  16        0     0.090959595     0  m   N ioband 1 pop_iss
  8,80  16    53041     0.090959754  4659  A  WS 396655753 + 8 <- (8,82) 6030752
  8,80  16    53042     0.090960007  4659  Q  WS 396655753 + 8 [kioband/16]
  8,80  16    53043     0.090960402  4659  M  WS 396655753 + 8 [kioband/16]
  8,80  16        0     0.090960962     0  m   N ioband 1 pop_iss
  8,80  16    53044     0.090961104  4659  A  WS 396655761 + 8 <- (8,82) 6030760
  8,80  16    53045     0.090961231  4659  Q  WS 396655761 + 8 [kioband/16]
  8,80  16    53046     0.090961496  4659  M  WS 396655761 + 8 [kioband/16]
  8,80  16        0     0.090961995     0  m   N ioband 1 pop_iss
  8,80  16    53047     0.090962117  4659  A  WS 396655769 + 8 <- (8,82) 6030768
  8,80  16    53048     0.090962222  4659  Q  WS 396655769 + 8 [kioband/16]
  8,80  16    53049     0.090962530  4659  M  WS 396655769 + 8 [kioband/16]
  8,80  16        0     0.090962974     0  m   N ioband 1 pop_iss
  8,80  16    53050     0.090963095  4659  A  WS 396655777 + 8 <- (8,82) 6030776
  8,80  16    53051     0.090963334  4659  Q  WS 396655777 + 8 [kioband/16]
  8,80  16    53052     0.090963518  4659  M  WS 396655777 + 8 [kioband/16]
  8,80  16        0     0.090963985     0  m   N ioband 1 pop_iss
  8,80  16    53053     0.090964220  4659  A  WS 396655785 + 8 <- (8,82) 6030784
  8,80  16    53054     0.090964327  4659  Q  WS 396655785 + 8 [kioband/16]
  8,80  16    53055     0.090964632  4659  M  WS 396655785 + 8 [kioband/16]
  8,80  16        0     0.090965094     0  m   N ioband 1 pop_iss
  8,80  16    53056     0.090965218  4659  A  WS 396655793 + 8 <- (8,82) 6030792
  8,80  16    53057     0.090965324  4659  Q  WS 396655793 + 8 [kioband/16]
  8,80  16    53058     0.090965548  4659  M  WS 396655793 + 8 [kioband/16]
  8,80  16        0     0.090965991     0  m   N ioband 1 pop_iss
  8,80  16    53059     0.090966112  4659  A  WS 396655801 + 8 <- (8,82) 6030800
  8,80  16    53060     0.090966221  4659  Q  WS 396655801 + 8 [kioband/16]
  8,80  16    53061     0.090966526  4659  M  WS 396655801 + 8 [kioband/16]
  8,80  16        0     0.090966944     0  m   N ioband 1 pop_iss
  8,80  16    53062     0.090967065  4659  A  WS 396655809 + 8 <- (8,82) 6030808
  8,80  16    53063     0.090967173  4659  Q  WS 396655809 + 8 [kioband/16]
  8,80  16    53064     0.090967383  4659  M  WS 396655809 + 8 [kioband/16]
  8,80  16        0     0.090968394     0  m   N ioband 1 add_iss 1650 1650
  8,80  16        0     0.090969068     0  m   N ioband 1 add_iss 1649 1649
  8,80  16        0     0.090969684     0  m   N ioband 1 add_iss 1648 1648
...

Regards,
Alan D. Brunelle
Hewlett-Packard

[-- Attachment #2: 0001-Added-in-blktrace-msgs-for-dm-ioband.patch --]
[-- Type: text/x-diff, Size: 3655 bytes --]

From bd918c40d92e4f074763a88bc8f13593c4f2dc52 Mon Sep 17 00:00:00 2001
From: Alan D. Brunelle <alan.brunelle@hp.com>
Date: Fri, 24 Apr 2009 17:30:32 -0400
Subject: [PATCH] Added in blktrace msgs for dm-ioband

Added the following messages:

In hold_bio - added messages as bios are being added to either the
urgent or normal hold lists:

  ioband <g_name> hold_urg <g_blocked>
  ioband <g_name> hold_nrm <g_blocked>

In make_issue_list - added messages when placing previously held bios
onto either the pushback or issue lists:

  ioband <g_name> add_pback <g_blocked> <c_blocked>
  ioband <g_name> add_iss <g_blocked> <c_blocked>

In release_urgent_bios - added a message indicating that an urgent bio
was added to the issue list:

  ioband <g_name> urg_add_iss <g_blocked>

In ioband_conduct - added messages as bios are being popped and executed
(sent to generic_make_request) or pushed back (bio_endio):

  ioband <g_name> pop_iss
  ioband <g_name> pop_pback

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
---
 drivers/md/dm-ioband-ctl.c |   30 +++++++++++++++++++++++++++---
 1 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-ioband-ctl.c b/drivers/md/dm-ioband-ctl.c
index 29bef11..26ad7a5 100644
--- a/drivers/md/dm-ioband-ctl.c
+++ b/drivers/md/dm-ioband-ctl.c
@@ -13,6 +13,7 @@
 #include <linux/slab.h>
 #include <linux/workqueue.h>
 #include <linux/rbtree.h>
+#include <linux/blktrace_api.h>
 #include "dm.h"
 #include "md.h"
 #include "dm-bio-list.h"
@@ -633,9 +634,15 @@ static void hold_bio(struct ioband_group *gp, struct bio *bio)
 		 */
 		dp->g_prepare_bio(gp, bio, IOBAND_URGENT);
 		bio_list_add(&dp->g_urgent_bios, bio);
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s hold_urg %d", dp->g_name,
+				  dp->g_blocked);
 	} else {
 		gp->c_blocked++;
 		dp->g_hold_bio(gp, bio);
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s hold_nrm %d", dp->g_name,
+				  gp->c_blocked);
 	}
 }
 
@@ -676,14 +683,21 @@ static int make_issue_list(struct ioband_group *gp, struct bio *bio,
 		clear_group_blocked(gp);
 		wake_up_all(&gp->c_waitq);
 	}
-	if (should_pushback_bio(gp))
+	if (should_pushback_bio(gp)) {
 		bio_list_add(pushback_list, bio);
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s add_pback %d %d", dp->g_name,
+				  dp->g_blocked, gp->c_blocked);
+	}
 	else {
 		int rw = bio_data_dir(bio);
 
 		gp->c_stat[rw].deferred++;
 		gp->c_stat[rw].sectors += bio_sectors(bio);
 		bio_list_add(issue_list, bio);
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s add_iss %d %d", dp->g_name,
+				  dp->g_blocked, gp->c_blocked);
 	}
 	return prepare_to_issue(gp, bio);
 }
@@ -703,6 +717,9 @@ static void release_urgent_bios(struct ioband_device *dp,
 		dp->g_blocked--;
 		dp->g_issued[bio_data_dir(bio)]++;
 		bio_list_add(issue_list, bio);
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s urg_add_iss %d", dp->g_name,
+				  dp->g_blocked);
 	}
 }
 
@@ -916,10 +933,17 @@ static void ioband_conduct(struct work_struct *work)
 
 	spin_unlock_irqrestore(&dp->g_lock, flags);
 
-	while ((bio = bio_list_pop(&issue_list)))
+	while ((bio = bio_list_pop(&issue_list))) {
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s pop_iss", dp->g_name);
 		generic_make_request(bio);
-	while ((bio = bio_list_pop(&pushback_list)))
+	}
+
+	while ((bio = bio_list_pop(&pushback_list))) {
+		blk_add_trace_msg(bdev_get_queue(bio->bi_bdev),
+				  "ioband %s pop_pback", dp->g_name);
 		bio_endio(bio, -EIO);
+	}
 }
 
 static int ioband_end_io(struct dm_target *ti, struct bio *bio,
-- 
1.5.6.3


^ permalink raw reply related	[flat|nested] 15+ messages in thread


* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-04-24 21:47 ` Alan D. Brunelle
@ 2009-04-27  9:44 ` Ryo Tsuruta
  2009-05-04  3:24   ` Li Zefan
  -1 siblings, 1 reply; 15+ messages in thread
From: Ryo Tsuruta @ 2009-04-27  9:44 UTC (permalink / raw)
  To: Alan.Brunelle; +Cc: dm-devel, linux-kernel

Hi Alan,

From: "Alan D. Brunelle" <Alan.Brunelle@hp.com>
Subject: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Fri, 24 Apr 2009 17:47:37 -0400

> Hi Ryo -
> 
> I don't know if you are taking in patches, but whilst trying to uncover
> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> upstream you'll have to #if around these (the blktrace message stuff
> came in around 2.6.26 or 27 I think).
> 
> My test case was to take a single 400GB storage device, put two 200GB
> partitions on it and then see what the "penalty" or overhead for adding
> dm-ioband on top. To do this I simply created an ext2 FS on each
> partition in parallel (two processes each doing a mkfs to one of the
> partitions). Then I put two dm-ioband devices on top of the two
> partitions (setting the weight to 100 in both cases - thus they should
> have equal access).
> 
> Using default values I was seeing /very/ large differences - on the
> order of 3X. When I bumped the number of tokens to a large number
> (10,240) the timings got much closer (<2%). I have found that using
> weight-iosize performs worse than weight (closer to 5% penalty).

I could reproduce similar results. One dm-ioband device seems to stop
issuing I/Os for a few seconds at times. I'll investigate more on that.
 
> I'll try to formalize these results as I go forward and report out on
> them. In any event, I thought I'd share this patch with you if you are
> interested...

Thanks. I'll include your patch in the next release.
 
> Here's a sampling from some blktrace output (sorry for the wrapping) - I
> should note that I'm a bit scared to see such large numbers of holds
> going on when the token count should be >5,000 for each device...
> Holding these back in an equal access situation is inhibiting the block
> I/O layer to merge (most) of these (as mkfs performs lots & lots of
> small but sequential I/Os).
> 
> ...
>   8,80  16        0     0.090651446     0  m   N ioband 1 hold_nrm 1654
>   8,80  16        0     0.090653575     0  m   N ioband 1 hold_nrm 1655
>   8,80  16        0     0.090655694     0  m   N ioband 1 hold_nrm 1656
>   8,80  16        0     0.090657609     0  m   N ioband 1 hold_nrm 1657
>   8,80  16        0     0.090659554     0  m   N ioband 1 hold_nrm 1658
>   8,80  16        0     0.090661327     0  m   N ioband 1 hold_nrm 1659
>   8,80  16        0     0.090666237     0  m   N ioband 1 hold_nrm 1660
>   8,80  16    53036     0.090675081  4713  C   W 391420657 + 1024 [0]
>   8,80  16    53037     0.090913365  4713  D   W 392995569 + 1024 [mkfs.ext2]
>   8,80  16        0     0.090950380     0  m   N ioband 1 add_iss 1659 1659
>   8,80  16        0     0.090951296     0  m   N ioband 1 add_iss 1658 1658
>   8,80  16        0     0.090951870     0  m   N ioband 1 add_iss 1657 1657
>   8,80  16        0     0.090952416     0  m   N ioband 1 add_iss 1656 1656
>   8,80  16        0     0.090952965     0  m   N ioband 1 add_iss 1655 1655
>   8,80  16        0     0.090953517     0  m   N ioband 1 add_iss 1654 1654
>   8,80  16        0     0.090954064     0  m   N ioband 1 add_iss 1653 1653
>   8,80  16        0     0.090954610     0  m   N ioband 1 add_iss 1652 1652
>   8,80  16        0     0.090955280     0  m   N ioband 1 add_iss 1651 1651
>   8,80  16        0     0.090956495     0  m   N ioband 1 pop_iss
>   8,80  16    53038     0.090957387  4659  A  WS 396655745 + 8 <- (8,82) 6030744
>   8,80  16    53039     0.090957561  4659  Q  WS 396655745 + 8 [kioband/16]
>   8,80  16    53040     0.090958328  4659  M  WS 396655745 + 8 [kioband/16]
>   8,80  16        0     0.090959595     0  m   N ioband 1 pop_iss
>   8,80  16    53041     0.090959754  4659  A  WS 396655753 + 8 <- (8,82) 6030752
>   8,80  16    53042     0.090960007  4659  Q  WS 396655753 + 8 [kioband/16]
>   8,80  16    53043     0.090960402  4659  M  WS 396655753 + 8 [kioband/16]
>   8,80  16        0     0.090960962     0  m   N ioband 1 pop_iss
>   8,80  16    53044     0.090961104  4659  A  WS 396655761 + 8 <- (8,82) 6030760
>   8,80  16    53045     0.090961231  4659  Q  WS 396655761 + 8 [kioband/16]
>   8,80  16    53046     0.090961496  4659  M  WS 396655761 + 8 [kioband/16]
>   8,80  16        0     0.090961995     0  m   N ioband 1 pop_iss
>   8,80  16    53047     0.090962117  4659  A  WS 396655769 + 8 <- (8,82) 6030768
>   8,80  16    53048     0.090962222  4659  Q  WS 396655769 + 8 [kioband/16]
>   8,80  16    53049     0.090962530  4659  M  WS 396655769 + 8 [kioband/16]
>   8,80  16        0     0.090962974     0  m   N ioband 1 pop_iss
>   8,80  16    53050     0.090963095  4659  A  WS 396655777 + 8 <- (8,82) 6030776
>   8,80  16    53051     0.090963334  4659  Q  WS 396655777 + 8 [kioband/16]
>   8,80  16    53052     0.090963518  4659  M  WS 396655777 + 8 [kioband/16]
>   8,80  16        0     0.090963985     0  m   N ioband 1 pop_iss
>   8,80  16    53053     0.090964220  4659  A  WS 396655785 + 8 <- (8,82) 6030784
>   8,80  16    53054     0.090964327  4659  Q  WS 396655785 + 8 [kioband/16]
>   8,80  16    53055     0.090964632  4659  M  WS 396655785 + 8 [kioband/16]
>   8,80  16        0     0.090965094     0  m   N ioband 1 pop_iss
>   8,80  16    53056     0.090965218  4659  A  WS 396655793 + 8 <- (8,82) 6030792
>   8,80  16    53057     0.090965324  4659  Q  WS 396655793 + 8 [kioband/16]
>   8,80  16    53058     0.090965548  4659  M  WS 396655793 + 8 [kioband/16]
>   8,80  16        0     0.090965991     0  m   N ioband 1 pop_iss
>   8,80  16    53059     0.090966112  4659  A  WS 396655801 + 8 <- (8,82) 6030800
>   8,80  16    53060     0.090966221  4659  Q  WS 396655801 + 8 [kioband/16]
>   8,80  16    53061     0.090966526  4659  M  WS 396655801 + 8 [kioband/16]
>   8,80  16        0     0.090966944     0  m   N ioband 1 pop_iss
>   8,80  16    53062     0.090967065  4659  A  WS 396655809 + 8 <- (8,82) 6030808
>   8,80  16    53063     0.090967173  4659  Q  WS 396655809 + 8 [kioband/16]
>   8,80  16    53064     0.090967383  4659  M  WS 396655809 + 8 [kioband/16]
>   8,80  16        0     0.090968394     0  m   N ioband 1 add_iss 1650 1650
>   8,80  16        0     0.090969068     0  m   N ioband 1 add_iss 1649 1649
>   8,80  16        0     0.090969684     0  m   N ioband 1 add_iss 1648 1648
> ...
> 
> Regards,
> Alan D. Brunelle
> Hewlett-Packard

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-04-27  9:44 ` Ryo Tsuruta
@ 2009-05-04  3:24   ` Li Zefan
  2009-05-07  0:23       ` Ryo Tsuruta
  0 siblings, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-05-04  3:24 UTC (permalink / raw)
  To: Ryo Tsuruta; +Cc: Alan.Brunelle, dm-devel, linux-kernel

Ryo Tsuruta wrote:
> Hi Alan,
> 
>> Hi Ryo -
>>
>> I don't know if you are taking in patches, but whilst trying to uncover
>> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
>> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
>> upstream you'll have to #if around these (the blktrace message stuff
>> came in around 2.6.26 or 27 I think).
>>
>> My test case was to take a single 400GB storage device, put two 200GB
>> partitions on it and then see what the "penalty" or overhead for adding
>> dm-ioband on top. To do this I simply created an ext2 FS on each
>> partition in parallel (two processes each doing a mkfs to one of the
>> partitions). Then I put two dm-ioband devices on top of the two
>> partitions (setting the weight to 100 in both cases - thus they should
>> have equal access).
>>
>> Using default values I was seeing /very/ large differences - on the
>> order of 3X. When I bumped the number of tokens to a large number
>> (10,240) the timings got much closer (<2%). I have found that using
>> weight-iosize performs worse than weight (closer to 5% penalty).
> 
> I could reproduce similar results. One dm-ioband device seems to stop
> issuing I/Os for a few seconds at times. I'll investigate more on that.
>  
>> I'll try to formalize these results as I go forward and report out on
>> them. In any event, I thought I'd share this patch with you if you are
>> interested...
> 
> Thanks. I'll include your patche into the next release.
>  

IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().

>> Here's a sampling from some blktrace output (sorry for the wrapping) - I
>> should note that I'm a bit scared to see such large numbers of holds
>> going on when the token count should be >5,000 for each device...
>> Holding these back in an equal access situation is inhibiting the block
>> I/O layer to merge (most) of these (as mkfs performs lots & lots of
>> small but sequential I/Os).


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-04  3:24   ` Li Zefan
@ 2009-05-07  0:23       ` Ryo Tsuruta
  0 siblings, 0 replies; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-07  0:23 UTC (permalink / raw)
  To: lizf; +Cc: Alan.Brunelle, dm-devel, linux-kernel

Hi Li,

From: Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Mon, 04 May 2009 11:24:27 +0800

> Ryo Tsuruta wrote:
> > Hi Alan,
> > 
> >> Hi Ryo -
> >>
> >> I don't know if you are taking in patches, but whilst trying to uncover
> >> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> >> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> >> upstream you'll have to #if around these (the blktrace message stuff
> >> came in around 2.6.26 or 27 I think).
> >>
> >> My test case was to take a single 400GB storage device, put two 200GB
> >> partitions on it and then see what the "penalty" or overhead for adding
> >> dm-ioband on top. To do this I simply created an ext2 FS on each
> >> partition in parallel (two processes each doing a mkfs to one of the
> >> partitions). Then I put two dm-ioband devices on top of the two
> >> partitions (setting the weight to 100 in both cases - thus they should
> >> have equal access).
> >>
> >> Using default values I was seeing /very/ large differences - on the
> >> order of 3X. When I bumped the number of tokens to a large number
> >> (10,240) the timings got much closer (<2%). I have found that using
> >> weight-iosize performs worse than weight (closer to 5% penalty).
> > 
> > I could reproduce similar results. One dm-ioband device seems to stop
> > issuing I/Os for a few seconds at times. I'll investigate more on that.
> >  
> >> I'll try to formalize these results as I go forward and report out on
> >> them. In any event, I thought I'd share this patch with you if you are
> >> interested...
> > 
> > Thanks. I'll include your patche into the next release.
> >  
> 
> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().

Thanks for your suggestion. I'll use TRACE_EVENT instead.
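
Something along these lines, perhaps - a rough, untested sketch of the
hold_nrm message recast as a TRACE_EVENT (the event and header names
are provisional, and the struct names follow the dm-ioband code in
Alan's patch):

```c
/* Sketch for a trace header, e.g. include/trace/events/dm_ioband.h */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM dm_ioband

TRACE_EVENT(ioband_hold_nrm,
	TP_PROTO(struct ioband_device *dp, struct ioband_group *gp),
	TP_ARGS(dp, gp),

	TP_STRUCT__entry(
		__string(g_name, dp->g_name)
		__field(int, c_blocked)
	),

	TP_fast_assign(
		__assign_str(g_name, dp->g_name);
		__entry->c_blocked = gp->c_blocked;
	),

	/* Keeps the same text as the blk_add_trace_msg() version */
	TP_printk("ioband %s hold_nrm %d",
		  __get_str(g_name), __entry->c_blocked)
);
```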

> 
> >> Here's a sampling from some blktrace output (sorry for the wrapping) - I
> >> should note that I'm a bit scared to see such large numbers of holds
> >> going on when the token count should be >5,000 for each device...
> >> Holding these back in an equal access situation is inhibiting the block
> >> I/O layer to merge (most) of these (as mkfs performs lots & lots of
> >> small but sequential I/Os).
> 

Thanks,
Ryo Tsuruta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
@ 2009-05-07  0:23       ` Ryo Tsuruta
  0 siblings, 0 replies; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-07  0:23 UTC (permalink / raw)
  To: lizf; +Cc: dm-devel, linux-kernel

Hi Li,

From: Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Mon, 04 May 2009 11:24:27 +0800

> Ryo Tsuruta wrote:
> > Hi Alan,
> > 
> >> Hi Ryo -
> >>
> >> I don't know if you are taking in patches, but whilst trying to uncover
> >> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> >> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> >> upstream you'll have to #if around these (the blktrace message stuff
> >> came in around 2.6.26 or 27 I think).
> >>
> >> My test case was to take a single 400GB storage device, put two 200GB
> >> partitions on it and then see what the "penalty" or overhead for adding
> >> dm-ioband on top. To do this I simply created an ext2 FS on each
> >> partition in parallel (two processes each doing a mkfs to one of the
> >> partitions). Then I put two dm-ioband devices on top of the two
> >> partitions (setting the weight to 100 in both cases - thus they should
> >> have equal access).
> >>
> >> Using default values I was seeing /very/ large differences - on the
> >> order of 3X. When I bumped the number of tokens to a large number
> >> (10,240) the timings got much closer (<2%). I have found that using
> >> weight-iosize performs worse than weight (closer to 5% penalty).
> > 
> > I could reproduce similar results. One dm-ioband device seems to stop
> > issuing I/Os for a few seconds at times. I'll investigate more on that.
> >  
> >> I'll try to formalize these results as I go forward and report out on
> >> them. In any event, I thought I'd share this patch with you if you are
> >> interested...
> > 
> > Thanks. I'll include your patch in the next release.
> >  
> 
> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().

Thanks for your suggestion. I'll use TRACE_EVENT instead.

> 
> >> Here's a sampling from some blktrace output (sorry for the wrapping) - I
> >> should note that I'm a bit scared to see such large numbers of holds
> >> going on when the token count should be >5,000 for each device...
> >> Holding these back in an equal access situation is preventing the block
> >> I/O layer from merging (most) of these (as mkfs performs lots & lots of
> >> small but sequential I/Os).
> 

Thanks,
Ryo Tsuruta


* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-07  0:23       ` Ryo Tsuruta
@ 2009-05-11 10:31       ` Ryo Tsuruta
  2009-05-12  3:49         ` Li Zefan
  -1 siblings, 1 reply; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-11 10:31 UTC (permalink / raw)
  To: lizf; +Cc: Alan.Brunelle, dm-devel, linux-kernel

Hi Li,

From: Ryo Tsuruta <ryov@valinux.co.jp>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Thu, 07 May 2009 09:23:22 +0900 (JST)

> Hi Li,
> 
> From: Li Zefan <lizf@cn.fujitsu.com>
> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> Date: Mon, 04 May 2009 11:24:27 +0800
> 
> > Ryo Tsuruta wrote:
> > > Hi Alan,
> > > 
> > >> Hi Ryo -
> > >>
> > >> I don't know if you are taking in patches, but whilst trying to uncover
> > >> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> > >> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> > >> upstream you'll have to #if around these (the blktrace message stuff
> > >> came in around 2.6.26 or 27 I think).
> > >>
> > >> My test case was to take a single 400GB storage device, put two 200GB
> > >> partitions on it and then see what the "penalty" or overhead for adding
> > >> dm-ioband on top. To do this I simply created an ext2 FS on each
> > >> partition in parallel (two processes each doing a mkfs to one of the
> > >> partitions). Then I put two dm-ioband devices on top of the two
> > >> partitions (setting the weight to 100 in both cases - thus they should
> > >> have equal access).
> > >>
> > >> Using default values I was seeing /very/ large differences - on the
> > >> order of 3X. When I bumped the number of tokens to a large number
> > >> (10,240) the timings got much closer (<2%). I have found that using
> > >> weight-iosize performs worse than weight (closer to 5% penalty).
> > > 
> > > I could reproduce similar results. One dm-ioband device seems to stop
> > > issuing I/Os for a few seconds at times. I'll investigate more on that.
> > >  
> > >> I'll try to formalize these results as I go forward and report out on
> > >> them. In any event, I thought I'd share this patch with you if you are
> > >> interested...
> > > 
> > > Thanks. I'll include your patch in the next release.
> > >  
> > 
> > IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().
> 
> Thanks for your suggestion. I'll use TRACE_EVENT instead.

blk_add_trace_msg() supports both blktrace and tracepoints. I can
get messages from dm-ioband through debugfs. Could you explain why
we should use TRACE_EVENT instead?

> > 
> > >> Here's a sampling from some blktrace output (sorry for the wrapping) - I
> > >> should note that I'm a bit scared to see such large numbers of holds
> > >> going on when the token count should be >5,000 for each device...
> > >> Holding these back in an equal access situation is preventing the block
> > >> I/O layer from merging (most) of these (as mkfs performs lots & lots of
> > >> small but sequential I/Os).
> > 
> 
> Thanks,
> Ryo Tsuruta

Thanks,
Ryo Tsuruta


* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-11 10:31       ` Ryo Tsuruta
@ 2009-05-12  3:49         ` Li Zefan
  2009-05-12  6:11           ` Ryo Tsuruta
  0 siblings, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-05-12  3:49 UTC (permalink / raw)
  To: Ryo Tsuruta; +Cc: Alan.Brunelle, dm-devel, linux-kernel

Ryo Tsuruta wrote:
> Hi Li,
> 
> From: Ryo Tsuruta <ryov@valinux.co.jp>
> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> Date: Thu, 07 May 2009 09:23:22 +0900 (JST)
> 
>> Hi Li,
>>
>> From: Li Zefan <lizf@cn.fujitsu.com>
>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
>> Date: Mon, 04 May 2009 11:24:27 +0800
>>
>>> Ryo Tsuruta wrote:
>>>> Hi Alan,
>>>>
>>>>> Hi Ryo -
>>>>>
>>>>> I don't know if you are taking in patches, but whilst trying to uncover
>>>>> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
>>>>> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
>>>>> upstream you'll have to #if around these (the blktrace message stuff
>>>>> came in around 2.6.26 or 27 I think).
>>>>>
>>>>> My test case was to take a single 400GB storage device, put two 200GB
>>>>> partitions on it and then see what the "penalty" or overhead for adding
>>>>> dm-ioband on top. To do this I simply created an ext2 FS on each
>>>>> partition in parallel (two processes each doing a mkfs to one of the
>>>>> partitions). Then I put two dm-ioband devices on top of the two
>>>>> partitions (setting the weight to 100 in both cases - thus they should
>>>>> have equal access).
>>>>>
>>>>> Using default values I was seeing /very/ large differences - on the
>>>>> order of 3X. When I bumped the number of tokens to a large number
>>>>> (10,240) the timings got much closer (<2%). I have found that using
>>>>> weight-iosize performs worse than weight (closer to 5% penalty).
>>>> I could reproduce similar results. One dm-ioband device seems to stop
>>>> issuing I/Os for a few seconds at times. I'll investigate more on that.
>>>>  
>>>>> I'll try to formalize these results as I go forward and report out on
>>>>> them. In any event, I thought I'd share this patch with you if you are
>>>>> interested...
>>>> Thanks. I'll include your patch in the next release.
>>>>  
>>> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().
>> Thanks for your suggestion. I'll use TRACE_EVENT instead.
> 
> blk_add_trace_msg() supports both blktrace and tracepoints. I can
> > get messages from dm-ioband through debugfs. Could you explain why
> > we should use TRACE_EVENT instead?
> 

Actually blk_add_trace_msg() has nothing to do with tracepoints..

If we use blk_add_trace_msg() in dm, we can use it in md, various block
drivers and even ext4. So the right thing is, if a subsystem wants to add
a trace facility, it should use tracepoints/TRACE_EVENT.

With TRACE_EVENT, you can get output through debugfs too, and it can be used
together with blktrace:

 # echo 1 > /sys/block/dm/trace/enable
 # echo blk > /debugfs/tracing/current_tracer
 # echo dm-ioband-foo > /debugfs/tracing/set_event
 # cat /debugfs/tracing/trace_pipe
 
And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
use the filter feature too.

>>>>> Here's a sampling from some blktrace output (sorry for the wrapping) - I
>>>>> should note that I'm a bit scared to see such large numbers of holds
>>>>> going on when the token count should be >5,000 for each device...
>>>>> Holding these back in an equal access situation is preventing the block
>>>>> I/O layer from merging (most) of these (as mkfs performs lots & lots of
>>>>> small but sequential I/Os).
>> Thanks,
>> Ryo Tsuruta
> 
> Thanks,
> Ryo Tsuruta
> 




* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-12  3:49         ` Li Zefan
@ 2009-05-12  6:11           ` Ryo Tsuruta
  2009-05-12  8:10             ` Li Zefan
  0 siblings, 1 reply; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-12  6:11 UTC (permalink / raw)
  To: lizf; +Cc: Alan.Brunelle, dm-devel, linux-kernel

Hi Li,

From: Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Tue, 12 May 2009 11:49:07 +0800

> Ryo Tsuruta wrote:
> > Hi Li,
> > 
> > From: Ryo Tsuruta <ryov@valinux.co.jp>
> > Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> > Date: Thu, 07 May 2009 09:23:22 +0900 (JST)
> > 
> >> Hi Li,
> >>
> >> From: Li Zefan <lizf@cn.fujitsu.com>
> >> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> >> Date: Mon, 04 May 2009 11:24:27 +0800
> >>
> >>> Ryo Tsuruta wrote:
> >>>> Hi Alan,
> >>>>
> >>>>> Hi Ryo -
> >>>>>
> >>>>> I don't know if you are taking in patches, but whilst trying to uncover
> >>>>> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> >>>>> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> >>>>> upstream you'll have to #if around these (the blktrace message stuff
> >>>>> came in around 2.6.26 or 27 I think).
> >>>>>
> >>>>> My test case was to take a single 400GB storage device, put two 200GB
> >>>>> partitions on it and then see what the "penalty" or overhead for adding
> >>>>> dm-ioband on top. To do this I simply created an ext2 FS on each
> >>>>> partition in parallel (two processes each doing a mkfs to one of the
> >>>>> partitions). Then I put two dm-ioband devices on top of the two
> >>>>> partitions (setting the weight to 100 in both cases - thus they should
> >>>>> have equal access).
> >>>>>
> >>>>> Using default values I was seeing /very/ large differences - on the
> >>>>> order of 3X. When I bumped the number of tokens to a large number
> >>>>> (10,240) the timings got much closer (<2%). I have found that using
> >>>>> weight-iosize performs worse than weight (closer to 5% penalty).
> >>>> I could reproduce similar results. One dm-ioband device seems to stop
> >>>> issuing I/Os for a few seconds at times. I'll investigate more on that.
> >>>>  
> >>>>> I'll try to formalize these results as I go forward and report out on
> >>>>> them. In any event, I thought I'd share this patch with you if you are
> >>>>> interested...
> >>>> Thanks. I'll include your patch in the next release.
> >>>>  
> >>> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().
> >> Thanks for your suggestion. I'll use TRACE_EVENT instead.
> > 
> > blk_add_trace_msg() supports both blktrace and tracepoints. I can
> > get messages from dm-ioband through debugfs. Could you expain why
> > should we use TRACE_EVENT instead?
> > 
> 
> Actually blk_add_trace_msg() has nothing to do with tracepoints..
> 
> If we use blk_add_trace_msg() in dm, we can use it in md, various block
> drivers and even ext4. So the right thing is, if a subsystem wants to add
> a trace facility, it should use tracepoints/TRACE_EVENT.
> 
> With TRACE_EVENT, you can get output through debugfs too, and it can be used
> together with blktrace:
> 
>  # echo 1 > /sys/block/dm/trace/enable
>  # echo blk > /debugfs/tracing/current_tracer
>  # echo dm-ioband-foo > /debugfs/tracing/set_event
>  # cat /debugfs/tracing/trace_pipe
>  
> And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
> use the filter feature too.

Thanks for explaining.
The base kernel of the current dm tree (2.6.30-rc4) does not support
dm-device tracing yet. I'll consider using TRACE_EVENT once the base
kernel supports dm-device tracing.

> 
> >>>>> Here's a sampling from some blktrace output (sorry for the wrapping) - I
> >>>>> should note that I'm a bit scared to see such large numbers of holds
> >>>>> going on when the token count should be >5,000 for each device...
> >>>>> Holding these back in an equal access situation is preventing the block
> >>>>> I/O layer from merging (most) of these (as mkfs performs lots & lots of
> >>>>> small but sequential I/Os).
> >> Thanks,
> >> Ryo Tsuruta
> > 
> > Thanks,
> > Ryo Tsuruta
> > 

Thanks,
Ryo Tsuruta


* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-12  6:11           ` Ryo Tsuruta
@ 2009-05-12  8:10             ` Li Zefan
  2009-05-12 10:12                 ` Ryo Tsuruta
  0 siblings, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-05-12  8:10 UTC (permalink / raw)
  To: Ryo Tsuruta
  Cc: Alan.Brunelle, dm-devel, linux-kernel, Jens Axboe, Ingo Molnar,
	Steven Rostedt, Frederic Weisbecker

CC more people on tracing issue

Ryo Tsuruta wrote:
> Hi Li,
> 
> From: Li Zefan <lizf@cn.fujitsu.com>
> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> Date: Tue, 12 May 2009 11:49:07 +0800
> 
>> Ryo Tsuruta wrote:
>>> Hi Li,
>>>
>>> From: Ryo Tsuruta <ryov@valinux.co.jp>
>>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
>>> Date: Thu, 07 May 2009 09:23:22 +0900 (JST)
>>>
>>>> Hi Li,
>>>>
>>>> From: Li Zefan <lizf@cn.fujitsu.com>
>>>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
>>>> Date: Mon, 04 May 2009 11:24:27 +0800
>>>>
>>>>> Ryo Tsuruta wrote:
>>>>>> Hi Alan,
>>>>>>
>>>>>>> Hi Ryo -
>>>>>>>
>>>>>>> I don't know if you are taking in patches, but whilst trying to uncover
>>>>>>> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
>>>>>>> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
>>>>>>> upstream you'll have to #if around these (the blktrace message stuff
>>>>>>> came in around 2.6.26 or 27 I think).
>>>>>>>
>>>>>>> My test case was to take a single 400GB storage device, put two 200GB
>>>>>>> partitions on it and then see what the "penalty" or overhead for adding
>>>>>>> dm-ioband on top. To do this I simply created an ext2 FS on each
>>>>>>> partition in parallel (two processes each doing a mkfs to one of the
>>>>>>> partitions). Then I put two dm-ioband devices on top of the two
>>>>>>> partitions (setting the weight to 100 in both cases - thus they should
>>>>>>> have equal access).
>>>>>>>
>>>>>>> Using default values I was seeing /very/ large differences - on the
>>>>>>> order of 3X. When I bumped the number of tokens to a large number
>>>>>>> (10,240) the timings got much closer (<2%). I have found that using
>>>>>>> weight-iosize performs worse than weight (closer to 5% penalty).
>>>>>> I could reproduce similar results. One dm-ioband device seems to stop
>>>>>> issuing I/Os for a few seconds at times. I'll investigate more on that.
>>>>>>  
>>>>>>> I'll try to formalize these results as I go forward and report out on
>>>>>>> them. In any event, I thought I'd share this patch with you if you are
>>>>>>> interested...
> >>>>>> Thanks. I'll include your patch in the next release.
>>>>>>  
>>>>> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().
>>>> Thanks for your suggestion. I'll use TRACE_EVENT instead.
>>> blk_add_trace_msg() supports both blktrace and tracepoints. I can
>>> get messages from dm-ioband through debugfs. Could you explain why
>>> we should use TRACE_EVENT instead?
>>>
>> Actually blk_add_trace_msg() has nothing to do with tracepoints..
>>
>> If we use blk_add_trace_msg() in dm, we can use it in md, various block
>> drivers and even ext4. So the right thing is, if a subsystem wants to add
>> a trace facility, it should use tracepoints/TRACE_EVENT.
>>
>> With TRACE_EVENT, you can get output through debugfs too, and it can be used
>> together with blktrace:
>>
>>  # echo 1 > /sys/block/dm/trace/enable
>>  # echo blk > /debugfs/tracing/current_tracer
>>  # echo dm-ioband-foo > /debugfs/tracing/set_event
>>  # cat /debugfs/tracing/trace_pipe
>>  
>> And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
>> use the filter feature too.
> 
> Thanks for explaining.
> The base kernel of the current dm tree (2.6.30-rc4) does not support
> dm-device tracing yet. I'll consider using TRACE_EVENT once the base
> kernel supports dm-device tracing.
> 

I think we don't have any plans for dm-device tracing support, do we? And you
don't need that.

I downloaded dm-ioband 1.10.4, applied it to the latest tip tree, and made
a patch to convert those blktrace msgs to trace events.

 # ls /debug/tracing/events/dm-ioband
 enable  ioband_endio     ioband_issue_list    ioband_urgent_bio
 filter  ioband_hold_bio  ioband_make_request
 # echo 1 > /debug/tracing/events/dm-ioband/enable
 # echo blk > /debug/tracing/current_tracer
 ...

Here is a sample output:

         pdflush-237   [001]  1706.377788: ioband_hold_bio: 8,9 ioband 1 hold_nrm 1
         pdflush-237   [001]  1706.377795: 254,0    Q   W 188832 + 8 [pdflush]
         pdflush-237   [001]  1706.377800: ioband_hold_bio: 8,9 ioband 1 hold_nrm 2
         pdflush-237   [001]  1706.377806: 254,0    Q   W 197032 + 8 [pdflush]
         pdflush-237   [001]  1706.377835: 254,0    C   W 82232 + 8 [0]
       kioband/1-2569  [001]  1706.377922: ioband_issue_list: 8,9 ioband 1 add_iss 1 1
       kioband/1-2569  [001]  1706.377923: ioband_issue_list: 8,9 ioband 1 add_iss 0 0
       kioband/1-2569  [001]  1706.377925: ioband_make_request: 8,9 ioband 1 pop_iss
       kioband/1-2569  [001]  1706.377940: ioband_make_request: 8,9 ioband 1 pop_iss
         konsole-2140  [001]  1706.377969: 254,0    C   W 90432 + 8 [0]
         konsole-2140  [001]  1706.378087: 254,0    C   W 98632 + 8 [0]
         konsole-2140  [001]  1706.378232: 254,0    C   W 106832 + 8 [0]
         konsole-2140  [001]  1706.378350: 254,0    C   W 115032 + 8 [0]
         konsole-2140  [001]  1706.378466: 254,0    C   W 123232 + 8 [0]
         konsole-2140  [001]  1706.378581: 254,0    C   W 131432 + 8 [0]

---
 drivers/md/dm-ioband-ctl.c       |   19 +++++-
 include/trace/events/dm-ioband.h |  134 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 150 insertions(+), 3 deletions(-)
 create mode 100644 include/trace/events/dm-ioband.h

diff --git a/drivers/md/dm-ioband-ctl.c b/drivers/md/dm-ioband-ctl.c
index 151b178..4033724 100644
--- a/drivers/md/dm-ioband-ctl.c
+++ b/drivers/md/dm-ioband-ctl.c
@@ -17,6 +17,9 @@
 #include "md.h"
 #include "dm-ioband.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/dm-ioband.h>
+
 #define POLICY_PARAM_START 6
 #define POLICY_PARAM_DELIM "=:,"
 
@@ -632,9 +635,11 @@ static void hold_bio(struct ioband_group *gp, struct bio *bio)
 		 */
 		dp->g_prepare_bio(gp, bio, IOBAND_URGENT);
 		bio_list_add(&dp->g_urgent_bios, bio);
+		trace_ioband_hold_bio(bio, dp, true);
 	} else {
 		gp->c_blocked++;
 		dp->g_hold_bio(gp, bio);
+		trace_ioband_hold_bio(bio, dp, false);
 	}
 }
 
@@ -675,14 +680,17 @@ static int make_issue_list(struct ioband_group *gp, struct bio *bio,
 		clear_group_blocked(gp);
 		wake_up_all(&gp->c_waitq);
 	}
-	if (should_pushback_bio(gp))
+	if (should_pushback_bio(gp)) {
 		bio_list_add(pushback_list, bio);
+		trace_ioband_issue_list(bio, dp, gp, true);
+	}
 	else {
 		int rw = bio_data_dir(bio);
 
 		gp->c_stat[rw].deferred++;
 		gp->c_stat[rw].sectors += bio_sectors(bio);
 		bio_list_add(issue_list, bio);
+		trace_ioband_issue_list(bio, dp, gp, false);
 	}
 	return prepare_to_issue(gp, bio);
 }
@@ -702,6 +710,7 @@ static void release_urgent_bios(struct ioband_device *dp,
 		dp->g_blocked--;
 		dp->g_issued[bio_data_dir(bio)]++;
 		bio_list_add(issue_list, bio);
+		trace_ioband_urgent_bio(bio, dp);
 	}
 }
 
@@ -915,10 +924,14 @@ static void ioband_conduct(struct work_struct *work)
 
 	spin_unlock_irqrestore(&dp->g_lock, flags);
 
-	while ((bio = bio_list_pop(&issue_list)))
+	while ((bio = bio_list_pop(&issue_list))) {
+		trace_ioband_make_request(bio, dp);
 		generic_make_request(bio);
-	while ((bio = bio_list_pop(&pushback_list)))
+	}
+	while ((bio = bio_list_pop(&pushback_list))) {
+		trace_ioband_endio(bio, dp);
 		bio_endio(bio, -EIO);
+	}
 }
 
 static int ioband_end_io(struct dm_target *ti, struct bio *bio,
diff --git a/include/trace/events/dm-ioband.h b/include/trace/events/dm-ioband.h
new file mode 100644
index 0000000..17dd61c
--- /dev/null
+++ b/include/trace/events/dm-ioband.h
@@ -0,0 +1,134 @@
+#if !defined(_TRACE_DM_IOBAND_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_DM_IOBAND_H
+
+#include <linux/tracepoint.h>
+#include <linux/interrupt.h>
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM dm-ioband
+
+TRACE_EVENT(ioband_hold_bio,
+
+	TP_PROTO(struct bio *bio, struct ioband_device *dp, bool urg),
+
+	TP_ARGS(bio, dp, urg),
+
+	TP_STRUCT__entry(
+		__field(	dev_t,		dev		)
+		__field(	bool,		urg		)
+		__field(	int,		g_blocked	)
+		__string(	g_name,		dp->g_name	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= bio->bi_bdev->bd_dev;
+		__entry->urg		= urg;
+		__entry->g_blocked	= dp->g_blocked;
+		__assign_str(g_name, dp->g_name);
+	),
+
+	TP_printk("%d,%d ioband %s hold_%s %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __get_str(g_name),
+		  __entry->urg ? "urg" : "nrm", __entry->g_blocked)
+);
+
+TRACE_EVENT(ioband_issue_list,
+
+	TP_PROTO(struct bio *bio, struct ioband_device *dp,
+		 struct ioband_group *gp, bool pushback),
+
+	TP_ARGS(bio, dp, gp, pushback),
+
+	TP_STRUCT__entry(
+		__field(	dev_t,		dev		)
+		__field(	bool,		pushback	)
+		__field(	int,		g_blocked	)
+		__field(	int,		c_blocked	)
+		__string(	g_name,		dp->g_name	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= bio->bi_bdev->bd_dev;
+		__entry->pushback	= pushback;
+		__entry->g_blocked	= dp->g_blocked;
+		__entry->c_blocked	= gp->c_blocked;
+		__assign_str(g_name, dp->g_name);
+	),
+
+	TP_printk("%d,%d ioband %s add_%s %d %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __get_str(g_name),
+		  __entry->pushback ? "pback" : "iss",
+		  __entry->g_blocked, __entry->c_blocked)
+);
+
+TRACE_EVENT(ioband_urgent_bio,
+
+	TP_PROTO(struct bio *bio, struct ioband_device *dp),
+
+	TP_ARGS(bio, dp),
+
+	TP_STRUCT__entry(
+		__field(	dev_t,		dev		)
+		__field(	int,		g_blocked	)
+		__string(	g_name,		dp->g_name	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= bio->bi_bdev->bd_dev;
+		__entry->g_blocked	= dp->g_blocked;
+		__assign_str(g_name, dp->g_name);
+	),
+
+	TP_printk("%d,%d ioband %s urg_add_iss %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __get_str(g_name), __entry->g_blocked)
+);
+
+TRACE_EVENT(ioband_make_request,
+
+	TP_PROTO(struct bio *bio, struct ioband_device *dp),
+
+	TP_ARGS(bio, dp),
+
+	TP_STRUCT__entry(
+		__field(	dev_t,		dev		)
+		__string(	g_name,		dp->g_name	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= bio->bi_bdev->bd_dev;
+		__assign_str(g_name, dp->g_name);
+	),
+
+	TP_printk("%d,%d ioband %s pop_iss",
+		  MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(g_name))
+);
+
+TRACE_EVENT(ioband_endio,
+
+	TP_PROTO(struct bio *bio, struct ioband_device *dp),
+
+	TP_ARGS(bio, dp),
+
+	TP_STRUCT__entry(
+		__field(	dev_t,		dev		)
+		__string(	g_name,		dp->g_name	)
+	),
+
+	TP_fast_assign(
+		__entry->dev		= bio->bi_bdev->bd_dev;
+		__assign_str(g_name, dp->g_name);
+	),
+
+	TP_printk("%d,%d ioband %s pop_pback",
+		  MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(g_name))
+);
+
+#endif /*  _TRACE_DM_IOBAND_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
+
+
-- 
1.5.4.rc3



* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-12  8:10             ` Li Zefan
@ 2009-05-12 10:12                 ` Ryo Tsuruta
  0 siblings, 0 replies; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-12 10:12 UTC (permalink / raw)
  To: lizf
  Cc: Alan.Brunelle, dm-devel, linux-kernel, jens.axboe, mingo,
	rostedt, fweisbec

Hi Li,

From: Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Tue, 12 May 2009 16:10:15 +0800

> CC more people on tracing issue
> 
> Ryo Tsuruta wrote:
> > Hi Li,
> > 
> > From: Li Zefan <lizf@cn.fujitsu.com>
> > Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> > Date: Tue, 12 May 2009 11:49:07 +0800
> > 
> >> Ryo Tsuruta wrote:
> >>> Hi Li,
> >>>
> >>> From: Ryo Tsuruta <ryov@valinux.co.jp>
> >>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> >>> Date: Thu, 07 May 2009 09:23:22 +0900 (JST)
> >>>
> >>>> Hi Li,
> >>>>
> >>>> From: Li Zefan <lizf@cn.fujitsu.com>
> >>>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> >>>> Date: Mon, 04 May 2009 11:24:27 +0800
> >>>>
> >>>>> Ryo Tsuruta wrote:
> >>>>>> Hi Alan,
> >>>>>>
> >>>>>>> Hi Ryo -
> >>>>>>>
> >>>>>>> I don't know if you are taking in patches, but whilst trying to uncover
> >>>>>>> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> >>>>>>> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> >>>>>>> upstream you'll have to #if around these (the blktrace message stuff
> >>>>>>> came in around 2.6.26 or 27 I think).
> >>>>>>>
> >>>>>>> My test case was to take a single 400GB storage device, put two 200GB
> >>>>>>> partitions on it and then see what the "penalty" or overhead for adding
> >>>>>>> dm-ioband on top. To do this I simply created an ext2 FS on each
> >>>>>>> partition in parallel (two processes each doing a mkfs to one of the
> >>>>>>> partitions). Then I put two dm-ioband devices on top of the two
> >>>>>>> partitions (setting the weight to 100 in both cases - thus they should
> >>>>>>> have equal access).
> >>>>>>>
> >>>>>>> Using default values I was seeing /very/ large differences - on the
> >>>>>>> order of 3X. When I bumped the number of tokens to a large number
> >>>>>>> (10,240) the timings got much closer (<2%). I have found that using
> >>>>>>> weight-iosize performs worse than weight (closer to 5% penalty).
> >>>>>> I could reproduce similar results. One dm-ioband device seems to stop
> >>>>>> issuing I/Os for a few seconds at times. I'll investigate more on that.
> >>>>>>  
> >>>>>>> I'll try to formalize these results as I go forward and report out on
> >>>>>>> them. In any event, I thought I'd share this patch with you if you are
> >>>>>>> interested...
> >>>>>> Thanks. I'll include your patch in the next release.
> >>>>>>  
> >>>>> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().
> >>>> Thanks for your suggestion. I'll use TRACE_EVENT instead.
> >>> blk_add_trace_msg() supports both blktrace and tracepoints. I can
> >>> get messages from dm-ioband through debugfs. Could you explain why
> >>> we should use TRACE_EVENT instead?
> >>>
> >> Actually blk_add_trace_msg() has nothing to do with tracepoints..
> >>
> >> If we use blk_add_trace_msg() in dm, we can use it in md, various block
> >> drivers and even ext4. So the right thing is, if a subsystem wants to add
> >> a trace facility, it should use tracepoints/TRACE_EVENT.
> >>
> >> With TRACE_EVENT, you can get output through debugfs too, and it can be used
> >> together with blktrace:
> >>
> >>  # echo 1 > /sys/block/dm/trace/enable
> >>  # echo blk > /debugfs/tracing/current_tracer
> >>  # echo dm-ioband-foo > /debugfs/tracing/set_event
> >>  # cat /debugfs/tracing/trace_pipe
> >>  
> >> And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
> >> use the filter feature too.
> > 
> > Thanks for explaining.
> > The base kernel of the current dm tree (2.6.30-rc4) does not support
> > dm-device tracing yet. I'll consider using TRACE_EVENT once the base
> > kernel supports dm-device tracing.
> > 
> 
> I think we don't have any plans for dm-device tracing support, do we? And you
> don't need that.
> 
> I downloaded dm-ioband 1.10.4, applied it to the latest tip tree, and made
> a patch to convert those blktrace msgs to trace events.
> 
>  # ls /debug/tracing/events/dm-ioband
>  enable  ioband_endio     ioband_issue_list    ioband_urgent_bio
>  filter  ioband_hold_bio  ioband_make_request
>  # echo 1 > /debug/tracing/events/dm-ioband/enable
>  # echo blk > /debug/tracing/current_tracer
>  ...

Is the following line unnecessary in the latest tip-tree?

   # echo 1 > /sys/block/dm/trace/enable

How can I enable/disable tracing on each device?
O.K. I'll try the latest tip-tree.
Thanks for your help!

> 
> Here is a sample output:
> 
>          pdflush-237   [001]  1706.377788: ioband_hold_bio: 8,9 ioband 1 hold_nrm 1
>          pdflush-237   [001]  1706.377795: 254,0    Q   W 188832 + 8 [pdflush]
>          pdflush-237   [001]  1706.377800: ioband_hold_bio: 8,9 ioband 1 hold_nrm 2
>          pdflush-237   [001]  1706.377806: 254,0    Q   W 197032 + 8 [pdflush]
>          pdflush-237   [001]  1706.377835: 254,0    C   W 82232 + 8 [0]
>        kioband/1-2569  [001]  1706.377922: ioband_issue_list: 8,9 ioband 1 add_iss 1 1
>        kioband/1-2569  [001]  1706.377923: ioband_issue_list: 8,9 ioband 1 add_iss 0 0
>        kioband/1-2569  [001]  1706.377925: ioband_make_request: 8,9 ioband 1 pop_iss
>        kioband/1-2569  [001]  1706.377940: ioband_make_request: 8,9 ioband 1 pop_iss
>          konsole-2140  [001]  1706.377969: 254,0    C   W 90432 + 8 [0]
>          konsole-2140  [001]  1706.378087: 254,0    C   W 98632 + 8 [0]
>          konsole-2140  [001]  1706.378232: 254,0    C   W 106832 + 8 [0]
>          konsole-2140  [001]  1706.378350: 254,0    C   W 115032 + 8 [0]
>          konsole-2140  [001]  1706.378466: 254,0    C   W 123232 + 8 [0]
>          konsole-2140  [001]  1706.378581: 254,0    C   W 131432 + 8 [0]
> 
> ---
>  drivers/md/dm-ioband-ctl.c       |   19 +++++-
>  include/trace/events/dm-ioband.h |  134 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 150 insertions(+), 3 deletions(-)
>  create mode 100644 include/trace/events/dm-ioband.h
> 
> diff --git a/drivers/md/dm-ioband-ctl.c b/drivers/md/dm-ioband-ctl.c
> index 151b178..4033724 100644
> --- a/drivers/md/dm-ioband-ctl.c
> +++ b/drivers/md/dm-ioband-ctl.c
> @@ -17,6 +17,9 @@
>  #include "md.h"
>  #include "dm-ioband.h"
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/dm-ioband.h>
> +
>  #define POLICY_PARAM_START 6
>  #define POLICY_PARAM_DELIM "=:,"
>  
> @@ -632,9 +635,11 @@ static void hold_bio(struct ioband_group *gp, struct bio *bio)
>  		 */
>  		dp->g_prepare_bio(gp, bio, IOBAND_URGENT);
>  		bio_list_add(&dp->g_urgent_bios, bio);
> +		trace_ioband_hold_bio(bio, dp, true);
>  	} else {
>  		gp->c_blocked++;
>  		dp->g_hold_bio(gp, bio);
> +		trace_ioband_hold_bio(bio, dp, false);
>  	}
>  }
>  
> @@ -675,14 +680,17 @@ static int make_issue_list(struct ioband_group *gp, struct bio *bio,
>  		clear_group_blocked(gp);
>  		wake_up_all(&gp->c_waitq);
>  	}
> -	if (should_pushback_bio(gp))
> +	if (should_pushback_bio(gp)) {
>  		bio_list_add(pushback_list, bio);
> +		trace_ioband_issue_list(bio, dp, gp, true);
> +	}
>  	else {
>  		int rw = bio_data_dir(bio);
>  
>  		gp->c_stat[rw].deferred++;
>  		gp->c_stat[rw].sectors += bio_sectors(bio);
>  		bio_list_add(issue_list, bio);
> +		trace_ioband_issue_list(bio, dp, gp, false);
>  	}
>  	return prepare_to_issue(gp, bio);
>  }
> @@ -702,6 +710,7 @@ static void release_urgent_bios(struct ioband_device *dp,
>  		dp->g_blocked--;
>  		dp->g_issued[bio_data_dir(bio)]++;
>  		bio_list_add(issue_list, bio);
> +		trace_ioband_urgent_bio(bio, dp);
>  	}
>  }
>  
> @@ -915,10 +924,14 @@ static void ioband_conduct(struct work_struct *work)
>  
>  	spin_unlock_irqrestore(&dp->g_lock, flags);
>  
> -	while ((bio = bio_list_pop(&issue_list)))
> +	while ((bio = bio_list_pop(&issue_list))) {
> +		trace_ioband_make_request(bio, dp);
>  		generic_make_request(bio);
> -	while ((bio = bio_list_pop(&pushback_list)))
> +	}
> +	while ((bio = bio_list_pop(&pushback_list))) {
> +		trace_ioband_endio(bio, dp);
>  		bio_endio(bio, -EIO);
> +	}
>  }
>  
>  static int ioband_end_io(struct dm_target *ti, struct bio *bio,
> diff --git a/include/trace/events/dm-ioband.h b/include/trace/events/dm-ioband.h
> new file mode 100644
> index 0000000..17dd61c
> --- /dev/null
> +++ b/include/trace/events/dm-ioband.h
> @@ -0,0 +1,134 @@
> +#if !defined(_TRACE_DM_IOBAND_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_DM_IOBAND_H
> +
> +#include <linux/tracepoint.h>
> +#include <linux/interrupt.h>
> +
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM dm-ioband
> +
> +TRACE_EVENT(ioband_hold_bio,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp, bool urg),
> +
> +	TP_ARGS(bio, dp, urg),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	bool,		urg		)
> +		__field(	int,		g_blocked	)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__entry->urg		= urg;
> +		__entry->g_blocked	= dp->g_blocked;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s hold_%s %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __get_str(g_name),
> +		  __entry->urg ? "urg" : "nrm", __entry->g_blocked)
> +);
> +
> +TRACE_EVENT(ioband_issue_list,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp,
> +		 struct ioband_group *gp, bool pushback),
> +
> +	TP_ARGS(bio, dp, gp, pushback),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	bool,		pushback	)
> +		__field(	int,		g_blocked	)
> +		__field(	int,		c_blocked	)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__entry->pushback	= pushback;
> +		__entry->g_blocked	= dp->g_blocked;
> +		__entry->c_blocked	= gp->c_blocked;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s add_%s %d %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __get_str(g_name),
> +		  __entry->pushback ? "pback" : "iss",
> +		  __entry->g_blocked, __entry->c_blocked)
> +);
> +
> +TRACE_EVENT(ioband_urgent_bio,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp),
> +
> +	TP_ARGS(bio, dp),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	int,		g_blocked	)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__entry->g_blocked	= dp->g_blocked;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s urg_add_iss %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __get_str(g_name), __entry->g_blocked)
> +);
> +
> +TRACE_EVENT(ioband_make_request,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp),
> +
> +	TP_ARGS(bio, dp),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s pop_iss",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(g_name))
> +);
> +
> +TRACE_EVENT(ioband_endio,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp),
> +
> +	TP_ARGS(bio, dp),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s pop_pback",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(g_name))
> +);
> +
> +#endif /*  _TRACE_DM_IOBAND_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> +
> +
> -- 
> 1.5.4.rc3
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread
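[Editor's aside, not part of the thread: the trace lines quoted above interleave dm-ioband events with ordinary blktrace Q/C records. A small sketch that tallies the ioband event types in such a dump — the `ioband_<name>:` token layout is assumed from the sample output, not from any dm-ioband documentation:]

```python
import re
from collections import Counter

# Editor's sketch (not part of the patch): tally dm-ioband event types in
# ftrace output shaped like the sample above. The "ioband_<name>:" token
# layout is assumed from that sample.
EVENT_RE = re.compile(r"\b(ioband_[a-z_]+):")

def count_events(lines):
    """Count occurrences of each ioband_* tracepoint in the given lines."""
    counts = Counter()
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts

sample = [
    "pdflush-237   [001]  1706.377788: ioband_hold_bio: 8,9 ioband 1 hold_nrm 1",
    "pdflush-237   [001]  1706.377795: 254,0    Q   W 188832 + 8 [pdflush]",
    "kioband/1-2569  [001]  1706.377922: ioband_issue_list: 8,9 ioband 1 add_iss 1 1",
    "kioband/1-2569  [001]  1706.377925: ioband_make_request: 8,9 ioband 1 pop_iss",
]
print(count_events(sample))
```

[Run over a full trace_pipe capture, this gives a quick sanity check of how many bios are being held versus issued, which is the behavior Alan was investigating.]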

* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
@ 2009-05-12 10:12                 ` Ryo Tsuruta
  0 siblings, 0 replies; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-12 10:12 UTC (permalink / raw)
  To: lizf; +Cc: fweisbec, linux-kernel, rostedt, dm-devel, jens.axboe, mingo

Hi Li,

From: Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Tue, 12 May 2009 16:10:15 +0800

> CC more people on tracing issue
> 
> Ryo Tsuruta wrote:
> > Hi Li,
> > 
> > From: Li Zefan <lizf@cn.fujitsu.com>
> > Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> > Date: Tue, 12 May 2009 11:49:07 +0800
> > 
> >> Ryo Tsuruta wrote:
> >>> Hi Li,
> >>>
> >>> From: Ryo Tsuruta <ryov@valinux.co.jp>
> >>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> >>> Date: Thu, 07 May 2009 09:23:22 +0900 (JST)
> >>>
> >>>> Hi Li,
> >>>>
> >>>> From: Li Zefan <lizf@cn.fujitsu.com>
> >>>> Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
> >>>> Date: Mon, 04 May 2009 11:24:27 +0800
> >>>>
> >>>>> Ryo Tsuruta wrote:
> >>>>>> Hi Alan,
> >>>>>>
> >>>>>>> Hi Ryo -
> >>>>>>>
> >>>>>>> I don't know if you are taking in patches, but whilst trying to uncover
> >>>>>>> some odd behavior I added some blktrace messages to dm-ioband-ctl.c. If
> >>>>>>> you're keeping one code base for old stuff (2.6.18-ish RHEL stuff) and
> >>>>>>> upstream you'll have to #if around these (the blktrace message stuff
> >>>>>>> came in around 2.6.26 or 27 I think).
> >>>>>>>
> >>>>>>> My test case was to take a single 400GB storage device, put two 200GB
> >>>>>>> partitions on it and then see what the "penalty" or overhead for adding
> >>>>>>> dm-ioband on top. To do this I simply created an ext2 FS on each
> >>>>>>> partition in parallel (two processes each doing a mkfs to one of the
> >>>>>>> partitions). Then I put two dm-ioband devices on top of the two
> >>>>>>> partitions (setting the weight to 100 in both cases - thus they should
> >>>>>>> have equal access).
> >>>>>>>
> >>>>>>> Using default values I was seeing /very/ large differences - on the
> >>>>>>> order of 3X. When I bumped the number of tokens to a large number
> >>>>>>> (10,240) the timings got much closer (<2%). I have found that using
> >>>>>>> weight-iosize performs worse than weight (closer to 5% penalty).
> >>>>>> I could reproduce similar results. One dm-ioband device seems to stop
> >>>>>> issuing I/Os for a few seconds at times. I'll investigate more on that.
> >>>>>>  
> >>>>>>> I'll try to formalize these results as I go forward and report out on
> >>>>>>> them. In any event, I thought I'd share this patch with you if you are
> >>>>>>> interested...
> >>>>>> Thanks. I'll include your patch into the next release.
> >>>>>>  
> >>>>> IMO we should use TRACE_EVENT instead of adding new blk_add_trace_msg().
> >>>> Thanks for your suggestion. I'll use TRACE_EVENT instead.
> >>> blk_add_trace_msg() supports both blktrace and tracepoints. I can
> >>> get messages from dm-ioband through debugfs. Could you expain why
> >>> should we use TRACE_EVENT instead?
> >>>
> >> Actually blk_add_trace_msg() has nothing to do with tracepoints..
> >>
> >> If we use blk_add_trace_msg() in dm, we can use it in md, various block
> >> drivers and even ext4. So the right thing is, if a subsystem wants to add
> >> trace facility, it should use tracepoints/TRACE_EVENT.
> >>
> >> With TRACE_EVENT, you can get output through debugfs too, and it can be used
> >> together with blktrace:
> >>
> >>  # echo 1 > /sys/block/dm/trace/enable
> >>  # echo blk > /debugfs/tracing/current_tracer
> >>  # echo dm-ioband-foo > /debugfs/tracing/set_event
> >>  # cat /debugfs/tracing/trace_pipe
> >>  
> >> And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
> >> use filter feature too.
> > 
> > Thanks for explaining.
> > The base kernel of current dm tree (2.6.30-rc4) has not supported
> > dm-device tracing yet. I'll consider using TRACE_EVENT when the base
> > kernel supports dm-device tracing.
> > 
> 
> I think we don't have any plan on dm-device tracing support, do we? And you
> don't need that.
> 
> I downloaded dm-ioband 1.10.4, and applied it to latest tip-tree, and made
> a patch to convert those blktrace msgs to trace events.
> 
>  # ls /debug/tracing/events/dm-ioband
>  enable  ioband_endio     ioband_issue_list    ioband_urgent_bio
>  filter  ioband_hold_bio  ioband_make_request
>  # echo 1 > /debug/tracing/events/dm-ioband/enable
>  # echo blk > /debug/tracing/current_tracer
>  ...

Is the following line unnecessary in the latest tip-tree?

   # echo 1 > /sys/block/dm/trace/enable

How do I enable/disable tracing on each device?
O.K. I'll try the latest tip-tree.
Thanks for your help!

> 
> Here is a sample output:
> 
>          pdflush-237   [001]  1706.377788: ioband_hold_bio: 8,9 ioband 1 hold_nrm 1
>          pdflush-237   [001]  1706.377795: 254,0    Q   W 188832 + 8 [pdflush]
>          pdflush-237   [001]  1706.377800: ioband_hold_bio: 8,9 ioband 1 hold_nrm 2
>          pdflush-237   [001]  1706.377806: 254,0    Q   W 197032 + 8 [pdflush]
>          pdflush-237   [001]  1706.377835: 254,0    C   W 82232 + 8 [0]
>        kioband/1-2569  [001]  1706.377922: ioband_issue_list: 8,9 ioband 1 add_iss 1 1
>        kioband/1-2569  [001]  1706.377923: ioband_issue_list: 8,9 ioband 1 add_iss 0 0
>        kioband/1-2569  [001]  1706.377925: ioband_make_request: 8,9 ioband 1 pop_iss
>        kioband/1-2569  [001]  1706.377940: ioband_make_request: 8,9 ioband 1 pop_iss
>          konsole-2140  [001]  1706.377969: 254,0    C   W 90432 + 8 [0]
>          konsole-2140  [001]  1706.378087: 254,0    C   W 98632 + 8 [0]
>          konsole-2140  [001]  1706.378232: 254,0    C   W 106832 + 8 [0]
>          konsole-2140  [001]  1706.378350: 254,0    C   W 115032 + 8 [0]
>          konsole-2140  [001]  1706.378466: 254,0    C   W 123232 + 8 [0]
>          konsole-2140  [001]  1706.378581: 254,0    C   W 131432 + 8 [0]
> 
> ---
>  drivers/md/dm-ioband-ctl.c       |   19 +++++-
>  include/trace/events/dm-ioband.h |  134 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 150 insertions(+), 3 deletions(-)
>  create mode 100644 include/trace/events/dm-ioband.h
> 
> diff --git a/drivers/md/dm-ioband-ctl.c b/drivers/md/dm-ioband-ctl.c
> index 151b178..4033724 100644
> --- a/drivers/md/dm-ioband-ctl.c
> +++ b/drivers/md/dm-ioband-ctl.c
> @@ -17,6 +17,9 @@
>  #include "md.h"
>  #include "dm-ioband.h"
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/dm-ioband.h>
> +
>  #define POLICY_PARAM_START 6
>  #define POLICY_PARAM_DELIM "=:,"
>  
> @@ -632,9 +635,11 @@ static void hold_bio(struct ioband_group *gp, struct bio *bio)
>  		 */
>  		dp->g_prepare_bio(gp, bio, IOBAND_URGENT);
>  		bio_list_add(&dp->g_urgent_bios, bio);
> +		trace_ioband_hold_bio(bio, dp, true);
>  	} else {
>  		gp->c_blocked++;
>  		dp->g_hold_bio(gp, bio);
> +		trace_ioband_hold_bio(bio, dp, false);
>  	}
>  }
>  
> @@ -675,14 +680,17 @@ static int make_issue_list(struct ioband_group *gp, struct bio *bio,
>  		clear_group_blocked(gp);
>  		wake_up_all(&gp->c_waitq);
>  	}
> -	if (should_pushback_bio(gp))
> +	if (should_pushback_bio(gp)) {
>  		bio_list_add(pushback_list, bio);
> +		trace_ioband_issue_list(bio, dp, gp, true);
> +	}
>  	else {
>  		int rw = bio_data_dir(bio);
>  
>  		gp->c_stat[rw].deferred++;
>  		gp->c_stat[rw].sectors += bio_sectors(bio);
>  		bio_list_add(issue_list, bio);
> +		trace_ioband_issue_list(bio, dp, gp, false);
>  	}
>  	return prepare_to_issue(gp, bio);
>  }
> @@ -702,6 +710,7 @@ static void release_urgent_bios(struct ioband_device *dp,
>  		dp->g_blocked--;
>  		dp->g_issued[bio_data_dir(bio)]++;
>  		bio_list_add(issue_list, bio);
> +		trace_ioband_urgent_bio(bio, dp);
>  	}
>  }
>  
> @@ -915,10 +924,14 @@ static void ioband_conduct(struct work_struct *work)
>  
>  	spin_unlock_irqrestore(&dp->g_lock, flags);
>  
> -	while ((bio = bio_list_pop(&issue_list)))
> +	while ((bio = bio_list_pop(&issue_list))) {
> +		trace_ioband_make_request(bio, dp);
>  		generic_make_request(bio);
> -	while ((bio = bio_list_pop(&pushback_list)))
> +	}
> +	while ((bio = bio_list_pop(&pushback_list))) {
> +		trace_ioband_endio(bio, dp);
>  		bio_endio(bio, -EIO);
> +	}
>  }
>  
>  static int ioband_end_io(struct dm_target *ti, struct bio *bio,
> diff --git a/include/trace/events/dm-ioband.h b/include/trace/events/dm-ioband.h
> new file mode 100644
> index 0000000..17dd61c
> --- /dev/null
> +++ b/include/trace/events/dm-ioband.h
> @@ -0,0 +1,134 @@
> +#if !defined(_TRACE_DM_IOBAND_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_DM_IOBAND_H
> +
> +#include <linux/tracepoint.h>
> +#include <linux/interrupt.h>
> +
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM dm-ioband
> +
> +TRACE_EVENT(ioband_hold_bio,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp, bool urg),
> +
> +	TP_ARGS(bio, dp, urg),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	bool,		urg		)
> +		__field(	int,		g_blocked	)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__entry->urg		= urg;
> +		__entry->g_blocked	= dp->g_blocked;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s hold_%s %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __get_str(g_name),
> +		  __entry->urg ? "urg" : "nrm", __entry->g_blocked)
> +);
> +
> +TRACE_EVENT(ioband_issue_list,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp,
> +		 struct ioband_group *gp, bool pushback),
> +
> +	TP_ARGS(bio, dp, gp, pushback),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	bool,		pushback	)
> +		__field(	int,		g_blocked	)
> +		__field(	int,		c_blocked	)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__entry->pushback	= pushback;
> +		__entry->g_blocked	= dp->g_blocked;
> +		__entry->c_blocked	= gp->c_blocked;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s add_%s %d %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __get_str(g_name),
> +		  __entry->pushback ? "pback" : "iss",
> +		  __entry->g_blocked, __entry->c_blocked)
> +);
> +
> +TRACE_EVENT(ioband_urgent_bio,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp),
> +
> +	TP_ARGS(bio, dp),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__field(	int,		g_blocked	)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__entry->g_blocked	= dp->g_blocked;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s urg_add_iss %d",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev),
> +		  __get_str(g_name), __entry->g_blocked)
> +);
> +
> +TRACE_EVENT(ioband_make_request,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp),
> +
> +	TP_ARGS(bio, dp),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s pop_iss",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(g_name))
> +);
> +
> +TRACE_EVENT(ioband_endio,
> +
> +	TP_PROTO(struct bio *bio, struct ioband_device *dp),
> +
> +	TP_ARGS(bio, dp),
> +
> +	TP_STRUCT__entry(
> +		__field(	dev_t,		dev		)
> +		__string(	g_name,		dp->g_name	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->dev		= bio->bi_bdev->bd_dev;
> +		__assign_str(g_name, dp->g_name);
> +	),
> +
> +	TP_printk("%d,%d ioband %s pop_pback",
> +		  MAJOR(__entry->dev), MINOR(__entry->dev), __get_str(g_name))
> +);
> +
> +#endif /*  _TRACE_DM_IOBAND_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> +
> +
> -- 
> 1.5.4.rc3
> 


* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-12 10:12                 ` Ryo Tsuruta
  (?)
@ 2009-05-13  0:56                 ` Li Zefan
  2009-05-13 11:30                     ` Ryo Tsuruta
  -1 siblings, 1 reply; 15+ messages in thread
From: Li Zefan @ 2009-05-13  0:56 UTC (permalink / raw)
  To: Ryo Tsuruta
  Cc: Alan.Brunelle, dm-devel, linux-kernel, jens.axboe, mingo,
	rostedt, fweisbec

>>>>>> Thanks for your suggestion. I'll use TRACE_EVENT instead.
>>>>> blk_add_trace_msg() supports both blktrace and tracepoints. I can
> >>>>> get messages from dm-ioband through debugfs. Could you explain why
>>>>> should we use TRACE_EVENT instead?
>>>>>
>>>> Actually blk_add_trace_msg() has nothing to do with tracepoints..
>>>>
> >>>> If we use blk_add_trace_msg() in dm, we can use it in md, various block
>>>> drivers and even ext4. So the right thing is, if a subsystem wants to add
>>>> trace facility, it should use tracepoints/TRACE_EVENT.
>>>>
>>>> With TRACE_EVENT, you can get output through debugfs too, and it can be used
>>>> together with blktrace:
>>>>
>>>>  # echo 1 > /sys/block/dm/trace/enable
>>>>  # echo blk > /debugfs/tracing/current_tracer
> >>>>  # echo dm-ioband-foo > /debugfs/tracing/set_event
> >>>>  # cat /debugfs/tracing/trace_pipe
>>>>  
>>>> And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
>>>> use filter feature too.
>>> Thanks for explaining.
>>> The base kernel of current dm tree (2.6.30-rc4) has not supported
>>> dm-device tracing yet. I'll consider using TRACE_EVENT when the base
>>> kernel supports dm-device tracing.
>>>
>> I think we don't have any plan on dm-device tracing support, do we? And you
>> don't need that.
>>
>> I downloaded dm-ioband 1.10.4, and applied it to latest tip-tree, and made
>> a patch to convert those blktrace msgs to trace events.
>>
>>  # ls /debug/tracing/events/dm-ioband
>>  enable  ioband_endio     ioband_issue_list    ioband_urgent_bio
>>  filter  ioband_hold_bio  ioband_make_request
>>  # echo 1 > /debug/tracing/events/dm-ioband/enable
>>  # echo blk > /debug/tracing/current_tracer
>>  ...
> 
> Is the following line unnecessary in the latest tip-tree?
> 
>    # echo 1 > /sys/block/dm/trace/enable
> 

Yes, still needed. I was not giving a complete demonstration. ;)

> How do I enable/disable tracing on each device?

echo 1 > /sys/block/dm-0/trace/enable
echo 1 > /sys/block/dm-1/trace/enable

Or use filter facility:

echo "dev == $(( (ma0 << 20) | mi0 ))" > /debug/tracing/events/dm-ioband/filter

This will filter out all events generated from dm-1, leaving those from dm-0.
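[Editor's aside: to make the encoding explicit, the `$(( (ma0 << 20) | mi0 ))` expression above relies on the trace-event filter representation of `dev_t`, which packs the major number into the high bits. A minimal sketch with assumed device numbers — 254:0 is a common value for dm-0, but verify with `ls -l /dev/dm-0`:]

```python
# Editor's sketch (assumed device numbers): make the filter expression
# concrete. Trace-event filters compare 'dev' using the kernel's expanded
# dev_t encoding, (major << 20) | minor, not the old 8-bit-minor encoding.
def dm_filter(major, minor):
    """Return the filter expression selecting one block device."""
    return f"dev == {(major << 20) | minor}"

# 254:0 is a typical (but not guaranteed) dev_t for dm-0.
print(dm_filter(254, 0))  # -> dev == 266338304
```

[In a shell this corresponds to writing that string to /debug/tracing/events/dm-ioband/filter, after which only events from that one device reach the trace buffer.]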



* Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
  2009-05-13  0:56                 ` Li Zefan
@ 2009-05-13 11:30                     ` Ryo Tsuruta
  0 siblings, 0 replies; 15+ messages in thread
From: Ryo Tsuruta @ 2009-05-13 11:30 UTC (permalink / raw)
  To: lizf
  Cc: Alan.Brunelle, dm-devel, linux-kernel, jens.axboe, mingo,
	rostedt, fweisbec

Hi Li,

From: Li Zefan <lizf@cn.fujitsu.com>
Subject: Re: [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband
Date: Wed, 13 May 2009 08:56:13 +0800

> >>>>>> Thanks for your suggestion. I'll use TRACE_EVENT instead.
> >>>>> blk_add_trace_msg() supports both blktrace and tracepoints. I can
> >>>>> get messages from dm-ioband through debugfs. Could you explain why
> >>>>> should we use TRACE_EVENT instead?
> >>>>>
> >>>> Actually blk_add_trace_msg() has nothing to do with tracepoints..
> >>>>
> >>>> If we use blk_add_trace_msg() in dm, we can use it in md, various block
> >>>> drivers and even ext4. So the right thing is, if a subsystem wants to add
> >>>> trace facility, it should use tracepoints/TRACE_EVENT.
> >>>>
> >>>> With TRACE_EVENT, you can get output through debugfs too, and it can be used
> >>>> together with blktrace:
> >>>>
> >>>>  # echo 1 > /sys/block/dm/trace/enable
> >>>>  # echo blk > /debugfs/tracing/current_tracer
> >>>>  # echo dm-ioband-foo > /debugfs/tracing/set_event
> >>>>  # cat /debugfs/tracing/trace_pipe
> >>>>  
> >>>> And you can enable dm-ioband-foo while disabling dm-ioband-bar, and you can
> >>>> use filter feature too.
> >>> Thanks for explaining.
> >>> The base kernel of current dm tree (2.6.30-rc4) has not supported
> >>> dm-device tracing yet. I'll consider using TRACE_EVENT when the base
> >>> kernel supports dm-device tracing.
> >>>
> >> I think we don't have any plan on dm-device tracing support, do we? And you
> >> don't need that.
> >>
> >> I downloaded dm-ioband 1.10.4, and applied it to latest tip-tree, and made
> >> a patch to convert those blktrace msgs to trace events.
> >>
> >>  # ls /debug/tracing/events/dm-ioband
> >>  enable  ioband_endio     ioband_issue_list    ioband_urgent_bio
> >>  filter  ioband_hold_bio  ioband_make_request
> >>  # echo 1 > /debug/tracing/events/dm-ioband/enable
> >>  # echo blk > /debug/tracing/current_tracer
> >>  ...
> > 
> > Is the following line unnecessary in the latest tip-tree?
> > 
> >    # echo 1 > /sys/block/dm/trace/enable
> > 
> 
> Yes, still needed. I was not giving a complete demonstration. ;)
> 
> > How do I enable/disable tracing on each device?
> 
> echo 1 > /sys/block/dm-0/trace/enable
> echo 1 > /sys/block/dm-1/trace/enable
> 
> Or use filter facility:
> 
> echo "dev == $(( (ma0 << 20) | mi0 ))" > /debug/tracing/events/dm-ioband/filter
> 
> This will filter out all events generated from dm-1, leaving those from dm-0.
> 

Thanks for further explaining. I'll include your patch into the next
release.

Thanks,
Ryo Tsuruta




Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-24 21:47 [RFC PATCH dm-ioband] Added in blktrace msgs for dm-ioband Alan D. Brunelle
2009-04-24 21:47 ` Alan D. Brunelle
2009-04-27  9:44 ` Ryo Tsuruta
2009-05-04  3:24   ` Li Zefan
2009-05-07  0:23     ` Ryo Tsuruta
2009-05-07  0:23       ` Ryo Tsuruta
2009-05-11 10:31       ` Ryo Tsuruta
2009-05-12  3:49         ` Li Zefan
2009-05-12  6:11           ` Ryo Tsuruta
2009-05-12  8:10             ` Li Zefan
2009-05-12 10:12               ` Ryo Tsuruta
2009-05-12 10:12                 ` Ryo Tsuruta
2009-05-13  0:56                 ` Li Zefan
2009-05-13 11:30                   ` Ryo Tsuruta
2009-05-13 11:30                     ` Ryo Tsuruta
