All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
@ 2011-05-31 22:03 Mikulas Patocka
  2011-06-01  6:18 ` Ankit Jain
  2011-06-01  9:51 ` Joe Thornber
  0 siblings, 2 replies; 23+ messages in thread
From: Mikulas Patocka @ 2011-05-31 22:03 UTC (permalink / raw)
  To: Alasdair G. Kergon, dm-devel

Hi

Here I'm sending new kcopyd throttling patches that allow the throttle to 
be set per module (i.e. one throttle for mirror and the other for 
snapshots).

Mikulas

---

dm-kcopyd: introduce per-module throttle structure

The structure contains the throttle parameter (it could be set in
/sys/module/*/parameters and auxulary variables for activity counting.

The throttle does nothing, it will be activated in the next patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-kcopyd.c    |    2 +-
 drivers/md/dm-raid1.c     |    5 ++++-
 drivers/md/dm-snap.c      |    5 ++++-
 include/linux/dm-kcopyd.h |   15 ++++++++++++++-
 4 files changed, 23 insertions(+), 4 deletions(-)

Index: linux-2.6.39-fast/drivers/md/dm-kcopyd.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-kcopyd.c	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-kcopyd.c	2011-05-31 23:49:47.000000000 +0200
@@ -617,7 +617,7 @@ int kcopyd_cancel(struct kcopyd_job *job
 /*-----------------------------------------------------------------
  * Client setup
  *---------------------------------------------------------------*/
-struct dm_kcopyd_client *dm_kcopyd_client_create(void)
+struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle)
 {
 	int r = -ENOMEM;
 	struct dm_kcopyd_client *kc;
Index: linux-2.6.39-fast/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-raid1.c	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-raid1.c	2011-05-31 23:46:22.000000000 +0200
@@ -83,6 +83,9 @@ struct mirror_set {
 	struct mirror mirror[0];
 };
 
+dm_kcopyd_throttle_declare(raid1_resync_throttle,
+		"A percentage of time allocated for raid resynchronization");
+
 static void wakeup_mirrord(void *context)
 {
 	struct mirror_set *ms = context;
@@ -1115,7 +1118,7 @@ static int mirror_ctr(struct dm_target *
 		goto err_destroy_wq;
 	}
 
-	ms->kcopyd_client = dm_kcopyd_client_create();
+	ms->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
 	if (IS_ERR(ms->kcopyd_client)) {
 		r = PTR_ERR(ms->kcopyd_client);
 		goto err_destroy_wq;
Index: linux-2.6.39-fast/drivers/md/dm-snap.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-snap.c	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-snap.c	2011-05-31 23:46:22.000000000 +0200
@@ -135,6 +135,9 @@ struct dm_snapshot {
 #define RUNNING_MERGE          0
 #define SHUTDOWN_MERGE         1
 
+dm_kcopyd_throttle_declare(snapshot_copy_throttle,
+		"A percentage of time allocated for copy on write");
+
 struct dm_dev *dm_snap_origin(struct dm_snapshot *s)
 {
 	return s->origin;
@@ -1111,7 +1114,7 @@ static int snapshot_ctr(struct dm_target
 		goto bad_hash_tables;
 	}
 
-	s->kcopyd_client = dm_kcopyd_client_create();
+	s->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
 	if (IS_ERR(s->kcopyd_client)) {
 		r = PTR_ERR(s->kcopyd_client);
 		ti->error = "Could not create kcopyd client";
Index: linux-2.6.39-fast/include/linux/dm-kcopyd.h
===================================================================
--- linux-2.6.39-fast.orig/include/linux/dm-kcopyd.h	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/include/linux/dm-kcopyd.h	2011-05-31 23:46:22.000000000 +0200
@@ -21,11 +21,24 @@
 
 #define DM_KCOPYD_IGNORE_ERROR 1
 
+struct dm_kcopyd_throttle {
+	unsigned throttle;
+	unsigned long num_io_jobs;
+	unsigned io_period;
+	unsigned total_period;
+	unsigned last_jiffies;
+};
+
+#define dm_kcopyd_throttle_declare(name, description)	\
+static struct dm_kcopyd_throttle dm_kcopyd_throttle = { 100, 0, 0, 0, 0 }; \
+module_param_named(name, dm_kcopyd_throttle.throttle, uint, 0644); \
+MODULE_PARM_DESC(name, description)
+
 /*
  * To use kcopyd you must first create a dm_kcopyd_client object.
  */
 struct dm_kcopyd_client;
-struct dm_kcopyd_client *dm_kcopyd_client_create(void);
+struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle);
 void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc);
 
 /*

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-05-31 22:03 [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure Mikulas Patocka
@ 2011-06-01  6:18 ` Ankit Jain
  2011-06-01  7:13   ` Ankit Jain
  2011-06-02 19:16   ` Mikulas Patocka
  2011-06-01  9:51 ` Joe Thornber
  1 sibling, 2 replies; 23+ messages in thread
From: Ankit Jain @ 2011-06-01  6:18 UTC (permalink / raw)
  To: device-mapper development; +Cc: Mikulas Patocka, Alasdair G. Kergon

Just some comments, inline -

On 06/01/2011 03:33 AM, Mikulas Patocka wrote:
> Hi
> 
> Here I'm sending new kcopyd throttling patches that allow the throttle to 
> be set per module (i.e. one throttle for mirror and the other for 
> snapshots).
> 
> Mikulas
> 
> ---
> 
> dm-kcopyd: introduce per-module throttle structure
> 
> The structure contains the throttle parameter (it could be set in
> /sys/module/*/parameters and auxulary variables for activity counting.
> 
> The throttle does nothing, it will be activated in the next patch.
> 
> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> 
> ---
>  drivers/md/dm-kcopyd.c    |    2 +-
>  drivers/md/dm-raid1.c     |    5 ++++-
>  drivers/md/dm-snap.c      |    5 ++++-
>  include/linux/dm-kcopyd.h |   15 ++++++++++++++-
>  4 files changed, 23 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6.39-fast/drivers/md/dm-kcopyd.c
> ===================================================================
> --- linux-2.6.39-fast.orig/drivers/md/dm-kcopyd.c	2011-05-31 23:46:18.000000000 +0200
> +++ linux-2.6.39-fast/drivers/md/dm-kcopyd.c	2011-05-31 23:49:47.000000000 +0200
> @@ -617,7 +617,7 @@ int kcopyd_cancel(struct kcopyd_job *job
>  /*-----------------------------------------------------------------
>   * Client setup
>   *---------------------------------------------------------------*/
> -struct dm_kcopyd_client *dm_kcopyd_client_create(void)
> +struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle)

IMHO, the other patch should be the first one, and this change should be
part of it. Basically, the first patch should add throttle support and
the related infrastructure for using it, and the second one can add
*usage/declaration* of those throttles in raid1/snapshot etc.

[...]
>  {
>  	int r = -ENOMEM;
>  	struct dm_kcopyd_client *kc;
> Index: linux-2.6.39-fast/drivers/md/dm-raid1.c
> ===================================================================
> --- linux-2.6.39-fast.orig/drivers/md/dm-raid1.c	2011-05-31 23:46:18.000000000 +0200
> +++ linux-2.6.39-fast/drivers/md/dm-raid1.c	2011-05-31 23:46:22.000000000 +0200
> @@ -83,6 +83,9 @@ struct mirror_set {
>  	struct mirror mirror[0];
>  };
>  
> +dm_kcopyd_throttle_declare(raid1_resync_throttle,
> +		"A percentage of time allocated for raid resynchronization");
> +
>  static void wakeup_mirrord(void *context)
>  {
>  	struct mirror_set *ms = context;
> @@ -1115,7 +1118,7 @@ static int mirror_ctr(struct dm_target *
>  		goto err_destroy_wq;
>  	}
>  
> -	ms->kcopyd_client = dm_kcopyd_client_create();
> +	ms->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);

IMHO, its not very clear where the dm_kcopyd_throttle variable here,
comes from, until you find the definition for dm_kcopyd_throttle_declare.

>  	if (IS_ERR(ms->kcopyd_client)) {
>  		r = PTR_ERR(ms->kcopyd_client);
>  		goto err_destroy_wq;
> Index: linux-2.6.39-fast/drivers/md/dm-snap.c
> ===================================================================
> --- linux-2.6.39-fast.orig/drivers/md/dm-snap.c	2011-05-31 23:46:18.000000000 +0200
> +++ linux-2.6.39-fast/drivers/md/dm-snap.c	2011-05-31 23:46:22.000000000 +0200
> @@ -135,6 +135,9 @@ struct dm_snapshot {
>  #define RUNNING_MERGE          0
>  #define SHUTDOWN_MERGE         1
>  
> +dm_kcopyd_throttle_declare(snapshot_copy_throttle,
> +		"A percentage of time allocated for copy on write");
> +
>  struct dm_dev *dm_snap_origin(struct dm_snapshot *s)
>  {
>  	return s->origin;
> @@ -1111,7 +1114,7 @@ static int snapshot_ctr(struct dm_target
>  		goto bad_hash_tables;
>  	}
>  
> -	s->kcopyd_client = dm_kcopyd_client_create();
> +	s->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
>  	if (IS_ERR(s->kcopyd_client)) {
>  		r = PTR_ERR(s->kcopyd_client);
>  		ti->error = "Could not create kcopyd client";
> Index: linux-2.6.39-fast/include/linux/dm-kcopyd.h
> ===================================================================
> --- linux-2.6.39-fast.orig/include/linux/dm-kcopyd.h	2011-05-31 23:46:18.000000000 +0200
> +++ linux-2.6.39-fast/include/linux/dm-kcopyd.h	2011-05-31 23:46:22.000000000 +0200
> @@ -21,11 +21,24 @@
>  
>  #define DM_KCOPYD_IGNORE_ERROR 1
>  
> +struct dm_kcopyd_throttle {
> +	unsigned throttle;
> +	unsigned long num_io_jobs;
> +	unsigned io_period;
> +	unsigned total_period;
> +	unsigned last_jiffies;
> +};
> +
> +#define dm_kcopyd_throttle_declare(name, description)	\
> +static struct dm_kcopyd_throttle dm_kcopyd_throttle = { 100, 0, 0, 0, 0 }; \
> +module_param_named(name, dm_kcopyd_throttle.throttle, uint, 0644); \
> +MODULE_PARM_DESC(name, description)
> +

I'm just trying to understand, how is it determined, when to use macros
with UPPER_CASE_LETTERS and when otherwise. To me, UPPER_CASE_LETTERS
makes it very clear that I'm using a macro, and since this seems to be
doing declaration etc also, that would seem helpful. But I don't know
the general style or reasoning followed in the kernel, hence the question.

>  /*
>   * To use kcopyd you must first create a dm_kcopyd_client object.
>   */
>  struct dm_kcopyd_client;
> -struct dm_kcopyd_client *dm_kcopyd_client_create(void);
> +struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle);
>  void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc);
>  
>  /*
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
> 

Thanks,
-- 
Ankit Jain
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-01  6:18 ` Ankit Jain
@ 2011-06-01  7:13   ` Ankit Jain
  2011-06-02 19:16   ` Mikulas Patocka
  1 sibling, 0 replies; 23+ messages in thread
From: Ankit Jain @ 2011-06-01  7:13 UTC (permalink / raw)
  To: device-mapper development; +Cc: Mikulas Patocka, Alasdair G. Kergon

On 06/01/2011 11:48 AM, Ankit Jain wrote:
>> +
>> +#define dm_kcopyd_throttle_declare(name, description)	\
>> +static struct dm_kcopyd_throttle dm_kcopyd_throttle = { 100, 0, 0, 0, 0 }; \
>> +module_param_named(name, dm_kcopyd_throttle.throttle, uint, 0644); \
>> +MODULE_PARM_DESC(name, description)
>> +
> 
> I'm just trying to understand, how is it determined, when to use macros
> with UPPER_CASE_LETTERS and when otherwise. To me, UPPER_CASE_LETTERS
> makes it very clear that I'm using a macro, and since this seems to be
> doing declaration etc also, that would seem helpful. But I don't know
> the general style or reasoning followed in the kernel, hence the question.
> 

According to Documentation/CodingStyle, a macro can be named in lower
case, if it resembles a function, which it doesn't in this case. Um and
maybe name it like DECLARE_.. ?

Thanks,
-- 
Ankit Jain
SUSE Labs

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-05-31 22:03 [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure Mikulas Patocka
  2011-06-01  6:18 ` Ankit Jain
@ 2011-06-01  9:51 ` Joe Thornber
  2011-06-02 19:55   ` Mikulas Patocka
  1 sibling, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2011-06-01  9:51 UTC (permalink / raw)
  To: device-mapper development; +Cc: Alasdair G. Kergon

On Tue, May 31, 2011 at 06:03:39PM -0400, Mikulas Patocka wrote:
> Hi
> 
> Here I'm sending new kcopyd throttling patches that allow the throttle to 
> be set per module (i.e. one throttle for mirror and the other for 
> snapshots).

I'm sorry Mikulas, but these patches still really worry me.

i) Your throttling is time based, rather than throughput.

ii) No account is taken of the source and destination devices.  If I
have an idle mirror, I would like the resyncing of a new leg to go at
100% speed.  Other mirrors in the system may be under load so should
resync in a more sympathetic manner.  Should we be using the
congestion functions to see if the source or destination are too busy?
 
iii) calling msleep in kernel code really makes me shudder, especially
when you call it with a fixed, small duration and then loop.  Can't
you work out how long you should sleep for?  Why not defer the worker
thread until this time? (i.e. use queue_delayed_work()).

iv) you haven't explained how the sys admin works out the correct
throttle value.  If we write this down then maybe we can think about
automating it.

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-01  6:18 ` Ankit Jain
  2011-06-01  7:13   ` Ankit Jain
@ 2011-06-02 19:16   ` Mikulas Patocka
  1 sibling, 0 replies; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-02 19:16 UTC (permalink / raw)
  To: Ankit Jain; +Cc: device-mapper development, Alasdair G. Kergon



On Wed, 1 Jun 2011, Ankit Jain wrote:

> Just some comments, inline -
> 
> On 06/01/2011 03:33 AM, Mikulas Patocka wrote:
> > Hi
> > 
> > Here I'm sending new kcopyd throttling patches that allow the throttle to 
> > be set per module (i.e. one throttle for mirror and the other for 
> > snapshots).
> > 
> > Mikulas
> > 
> > ---
> > 
> > dm-kcopyd: introduce per-module throttle structure
> > 
> > The structure contains the throttle parameter (it could be set in
> > /sys/module/*/parameters and auxulary variables for activity counting.
> > 
> > The throttle does nothing, it will be activated in the next patch.
> > 
> > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
> > 
> > ---
> >  drivers/md/dm-kcopyd.c    |    2 +-
> >  drivers/md/dm-raid1.c     |    5 ++++-
> >  drivers/md/dm-snap.c      |    5 ++++-
> >  include/linux/dm-kcopyd.h |   15 ++++++++++++++-
> >  4 files changed, 23 insertions(+), 4 deletions(-)
> > 
> > Index: linux-2.6.39-fast/drivers/md/dm-kcopyd.c
> > ===================================================================
> > --- linux-2.6.39-fast.orig/drivers/md/dm-kcopyd.c	2011-05-31 23:46:18.000000000 +0200
> > +++ linux-2.6.39-fast/drivers/md/dm-kcopyd.c	2011-05-31 23:49:47.000000000 +0200
> > @@ -617,7 +617,7 @@ int kcopyd_cancel(struct kcopyd_job *job
> >  /*-----------------------------------------------------------------
> >   * Client setup
> >   *---------------------------------------------------------------*/
> > -struct dm_kcopyd_client *dm_kcopyd_client_create(void)
> > +struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle)
> 
> IMHO, the other patch should be the first one, and this change should be
> part of it. Basically, the first patch should add throttle support and
> the related infrastructure for using it, and the second one can add
> *usage/declaration* of those throttles in raid1/snapshot etc.
> 
> [...]
> >  {
> >  	int r = -ENOMEM;
> >  	struct dm_kcopyd_client *kc;
> > Index: linux-2.6.39-fast/drivers/md/dm-raid1.c
> > ===================================================================
> > --- linux-2.6.39-fast.orig/drivers/md/dm-raid1.c	2011-05-31 23:46:18.000000000 +0200
> > +++ linux-2.6.39-fast/drivers/md/dm-raid1.c	2011-05-31 23:46:22.000000000 +0200
> > @@ -83,6 +83,9 @@ struct mirror_set {
> >  	struct mirror mirror[0];
> >  };
> >  
> > +dm_kcopyd_throttle_declare(raid1_resync_throttle,
> > +		"A percentage of time allocated for raid resynchronization");
> > +
> >  static void wakeup_mirrord(void *context)
> >  {
> >  	struct mirror_set *ms = context;
> > @@ -1115,7 +1118,7 @@ static int mirror_ctr(struct dm_target *
> >  		goto err_destroy_wq;
> >  	}
> >  
> > -	ms->kcopyd_client = dm_kcopyd_client_create();
> > +	ms->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
> 
> IMHO, its not very clear where the dm_kcopyd_throttle variable here,
> comes from, until you find the definition for dm_kcopyd_throttle_declare.
> 
> >  	if (IS_ERR(ms->kcopyd_client)) {
> >  		r = PTR_ERR(ms->kcopyd_client);
> >  		goto err_destroy_wq;
> > Index: linux-2.6.39-fast/drivers/md/dm-snap.c
> > ===================================================================
> > --- linux-2.6.39-fast.orig/drivers/md/dm-snap.c	2011-05-31 23:46:18.000000000 +0200
> > +++ linux-2.6.39-fast/drivers/md/dm-snap.c	2011-05-31 23:46:22.000000000 +0200
> > @@ -135,6 +135,9 @@ struct dm_snapshot {
> >  #define RUNNING_MERGE          0
> >  #define SHUTDOWN_MERGE         1
> >  
> > +dm_kcopyd_throttle_declare(snapshot_copy_throttle,
> > +		"A percentage of time allocated for copy on write");
> > +
> >  struct dm_dev *dm_snap_origin(struct dm_snapshot *s)
> >  {
> >  	return s->origin;
> > @@ -1111,7 +1114,7 @@ static int snapshot_ctr(struct dm_target
> >  		goto bad_hash_tables;
> >  	}
> >  
> > -	s->kcopyd_client = dm_kcopyd_client_create();
> > +	s->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
> >  	if (IS_ERR(s->kcopyd_client)) {
> >  		r = PTR_ERR(s->kcopyd_client);
> >  		ti->error = "Could not create kcopyd client";
> > Index: linux-2.6.39-fast/include/linux/dm-kcopyd.h
> > ===================================================================
> > --- linux-2.6.39-fast.orig/include/linux/dm-kcopyd.h	2011-05-31 23:46:18.000000000 +0200
> > +++ linux-2.6.39-fast/include/linux/dm-kcopyd.h	2011-05-31 23:46:22.000000000 +0200
> > @@ -21,11 +21,24 @@
> >  
> >  #define DM_KCOPYD_IGNORE_ERROR 1
> >  
> > +struct dm_kcopyd_throttle {
> > +	unsigned throttle;
> > +	unsigned long num_io_jobs;
> > +	unsigned io_period;
> > +	unsigned total_period;
> > +	unsigned last_jiffies;
> > +};
> > +
> > +#define dm_kcopyd_throttle_declare(name, description)	\
> > +static struct dm_kcopyd_throttle dm_kcopyd_throttle = { 100, 0, 0, 0, 0 }; \
> > +module_param_named(name, dm_kcopyd_throttle.throttle, uint, 0644); \
> > +MODULE_PARM_DESC(name, description)
> > +
> 
> I'm just trying to understand, how is it determined, when to use macros
> with UPPER_CASE_LETTERS and when otherwise. To me, UPPER_CASE_LETTERS
> makes it very clear that I'm using a macro, and since this seems to be
> doing declaration etc also, that would seem helpful. But I don't know
> the general style or reasoning followed in the kernel, hence the question.
> 
> >  /*
> >   * To use kcopyd you must first create a dm_kcopyd_client object.
> >   */
> >  struct dm_kcopyd_client;
> > -struct dm_kcopyd_client *dm_kcopyd_client_create(void);
> > +struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle);
> >  void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc);
> >  
> >  /*
> > 
> > --
> > dm-devel mailing list
> > dm-devel@redhat.com
> > https://www.redhat.com/mailman/listinfo/dm-devel
> > 
> 
> Thanks,
> -- 
> Ankit Jain
> SUSE Labs

Hi

This is the updated patch with macro name in capital letters.

I think the order of patches doesn't matter much, the result will be the 
same anyway.

Mikulas

---

dm-kcopyd: introduce per-module throttle structure

The structure contains the throttle parameter (it could be set in
/sys/module/*/parameters and auxulary variables for activity counting.

The throttle does nothing, it will be activated in the next patch.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 drivers/md/dm-kcopyd.c    |    2 +-
 drivers/md/dm-raid1.c     |    5 ++++-
 drivers/md/dm-snap.c      |    5 ++++-
 include/linux/dm-kcopyd.h |   15 ++++++++++++++-
 4 files changed, 23 insertions(+), 4 deletions(-)

Index: linux-2.6.39-fast/drivers/md/dm-kcopyd.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-kcopyd.c	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-kcopyd.c	2011-05-31 23:49:47.000000000 +0200
@@ -617,7 +617,7 @@ int kcopyd_cancel(struct kcopyd_job *job
 /*-----------------------------------------------------------------
  * Client setup
  *---------------------------------------------------------------*/
-struct dm_kcopyd_client *dm_kcopyd_client_create(void)
+struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle)
 {
 	int r = -ENOMEM;
 	struct dm_kcopyd_client *kc;
Index: linux-2.6.39-fast/drivers/md/dm-raid1.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-raid1.c	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-raid1.c	2011-05-31 23:46:22.000000000 +0200
@@ -83,6 +83,9 @@ struct mirror_set {
 	struct mirror mirror[0];
 };
 
+DECLARE_DM_KCOPYD_THROTTLE(raid1_resync_throttle,
+		"A percentage of time allocated for raid resynchronization");
+
 static void wakeup_mirrord(void *context)
 {
 	struct mirror_set *ms = context;
@@ -1115,7 +1118,7 @@ static int mirror_ctr(struct dm_target *
 		goto err_destroy_wq;
 	}
 
-	ms->kcopyd_client = dm_kcopyd_client_create();
+	ms->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
 	if (IS_ERR(ms->kcopyd_client)) {
 		r = PTR_ERR(ms->kcopyd_client);
 		goto err_destroy_wq;
Index: linux-2.6.39-fast/drivers/md/dm-snap.c
===================================================================
--- linux-2.6.39-fast.orig/drivers/md/dm-snap.c	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/drivers/md/dm-snap.c	2011-05-31 23:46:22.000000000 +0200
@@ -135,6 +135,9 @@ struct dm_snapshot {
 #define RUNNING_MERGE          0
 #define SHUTDOWN_MERGE         1
 
+DECLARE_DM_KCOPYD_THROTTLE(snapshot_copy_throttle,
+		"A percentage of time allocated for copy on write");
+
 struct dm_dev *dm_snap_origin(struct dm_snapshot *s)
 {
 	return s->origin;
@@ -1111,7 +1114,7 @@ static int snapshot_ctr(struct dm_target
 		goto bad_hash_tables;
 	}
 
-	s->kcopyd_client = dm_kcopyd_client_create();
+	s->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
 	if (IS_ERR(s->kcopyd_client)) {
 		r = PTR_ERR(s->kcopyd_client);
 		ti->error = "Could not create kcopyd client";
Index: linux-2.6.39-fast/include/linux/dm-kcopyd.h
===================================================================
--- linux-2.6.39-fast.orig/include/linux/dm-kcopyd.h	2011-05-31 23:46:18.000000000 +0200
+++ linux-2.6.39-fast/include/linux/dm-kcopyd.h	2011-05-31 23:46:22.000000000 +0200
@@ -21,11 +21,24 @@
 
 #define DM_KCOPYD_IGNORE_ERROR 1
 
+struct dm_kcopyd_throttle {
+	unsigned throttle;
+	unsigned long num_io_jobs;
+	unsigned io_period;
+	unsigned total_period;
+	unsigned last_jiffies;
+};
+
+#define DECLARE_DM_KCOPYD_THROTTLE(name, description)	\
+static struct dm_kcopyd_throttle dm_kcopyd_throttle = { 100, 0, 0, 0, 0 }; \
+module_param_named(name, dm_kcopyd_throttle.throttle, uint, 0644); \
+MODULE_PARM_DESC(name, description)
+
 /*
  * To use kcopyd you must first create a dm_kcopyd_client object.
  */
 struct dm_kcopyd_client;
-struct dm_kcopyd_client *dm_kcopyd_client_create(void);
+struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle);
 void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc);
 
 /*

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-01  9:51 ` Joe Thornber
@ 2011-06-02 19:55   ` Mikulas Patocka
  2011-06-03 11:01     ` Joe Thornber
  0 siblings, 1 reply; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-02 19:55 UTC (permalink / raw)
  Cc: dm-devel, Alasdair G. Kergon



On Wed, 1 Jun 2011, Joe Thornber wrote:

> On Tue, May 31, 2011 at 06:03:39PM -0400, Mikulas Patocka wrote:
> > Hi
> > 
> > Here I'm sending new kcopyd throttling patches that allow the throttle to 
> > be set per module (i.e. one throttle for mirror and the other for 
> > snapshots).
> 
> I'm sorry Mikulas, but these patches still really worry me.
> 
> i) Your throttling is time based, rather than throughput.

It is impossible to determine what throughput the underlying device has.

> ii) No account is taken of the source and destination devices.  If I
> have an idle mirror, I would like the resyncing of a new leg to go at
> 100% speed.  Other mirrors in the system may be under load so should
> resync in a more sympathetic manner.  Should we be using the
> congestion functions to see if the source or destination are too busy?

You don't know how much I/O is there on the underlying device.

If you want this kind of control, ask the people who develop the I/O 
scheduler.

> iii) calling msleep in kernel code really makes me shudder, especially
> when you call it with a fixed, small duration and then loop.  Can't
> you work out how long you should sleep for?

I tried to sleep shorter time more often, but it didn't work. If you sleep 
10 times for 10ms, it doesn't limit disk throughput much as if you sleep 
once for 100ms. I don't know why, but smaller sleeps don't work (a 
possible reason could be readahead/writeback cache in the disk, but if I 
disabled it setting WCE=0, RCD=1, smaller sleeps still didn't limit).

> Why not defer the worker
> thread until this time? (i.e. use queue_delayed_work()).

Because msleep is simpler.

> iv) you haven't explained how the sys admin works out the correct
> throttle value.

There is no "correct" value. The "correct" value depends on how important 
is copying itself v.s. other i/o.

> If we write this down then maybe we can think about automating it.

In theory (if disk scheduler were perfect), we wouldn't need any 
throttling. The disk scheduler should recognize that the kcopyd process is 
sending way more requests than any other process and should lower the 
i/o priority of kcopyd process.

In practice, the disk scheduler doesn't do it well, so kcopyd hurts the 
users. If you want an automated fix, fix the disk scheduler. But don't put 
disk scheduler logic into device mapper --- it dosn't belong there.

Mikulas

> - Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-02 19:55   ` Mikulas Patocka
@ 2011-06-03 11:01     ` Joe Thornber
  2011-06-03 15:54       ` Mike Snitzer
  2011-06-07 17:50       ` Mikulas Patocka
  0 siblings, 2 replies; 23+ messages in thread
From: Joe Thornber @ 2011-06-03 11:01 UTC (permalink / raw)
  To: device-mapper development; +Cc: Alasdair G. Kergon

On Thu, Jun 02, 2011 at 03:55:16PM -0400, Mikulas Patocka wrote:
> > iv) you haven't explained how the sys admin works out the correct
> > throttle value.
> 
> There is no "correct" value. The "correct" value depends on how important 
> is copying itself v.s. other i/o.

So who is going to set this?  Do you really have no advice for them
beyond 'there is no correct value'?

> In theory (if disk scheduler were perfect), we wouldn't need any 
> throttling. The disk scheduler should recognize that the kcopyd process is 
> sending way more requests than any other process and should lower the 
> i/o priority of kcopyd process.
> 
> In practice, the disk scheduler doesn't do it well, so kcopyd hurts the 
> users. If you want an automated fix, fix the disk scheduler. But don't put 
> disk scheduler logic into device mapper --- it dosn't belong there.

I totally agree with these two paragraphs.  Any throttling you add to
kcopyd is always going to be a hack.

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-03 11:01     ` Joe Thornber
@ 2011-06-03 15:54       ` Mike Snitzer
  2011-06-07 17:50       ` Mikulas Patocka
  1 sibling, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2011-06-03 15:54 UTC (permalink / raw)
  To: device-mapper development; +Cc: Vivek Goyal, Alasdair G. Kergon

On Fri, Jun 03 2011 at  7:01am -0400,
Joe Thornber <thornber@redhat.com> wrote:

> On Thu, Jun 02, 2011 at 03:55:16PM -0400, Mikulas Patocka wrote:
> > > iv) you haven't explained how the sys admin works out the correct
> > > throttle value.
> > 
> > There is no "correct" value. The "correct" value depends on how important 
> > is copying itself v.s. other i/o.
> 
> So who is going to set this?  Do you really have no advice for them
> beyond 'there is no correct value'?
> 
> > In theory (if disk scheduler were perfect), we wouldn't need any 
> > throttling. The disk scheduler should recognize that the kcopyd process is 
> > sending way more requests than any other process and should lower the 
> > i/o priority of kcopyd process.
> > 
> > In practice, the disk scheduler doesn't do it well, so kcopyd hurts the 
> > users. If you want an automated fix, fix the disk scheduler. But don't put 
> > disk scheduler logic into device mapper --- it dosn't belong there.
> 
> I totally agree with these two paragraphs.  Any throttling you add to
> kcopyd is always going to be a hack.

Wouldn't it be better to tie in to the block layer's new throttling
infrastructure that Vivek added?  Possibly expose a callback that
enables kcopyd consuming devices to throttle kcopyd as a side-effect of
the higher-level throttle?

Mike

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-03 11:01     ` Joe Thornber
  2011-06-03 15:54       ` Mike Snitzer
@ 2011-06-07 17:50       ` Mikulas Patocka
  2011-06-09  9:47         ` Joe Thornber
  1 sibling, 1 reply; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-07 17:50 UTC (permalink / raw)
  To: device-mapper development, Joe Thornber; +Cc: Alasdair G. Kergon



On Fri, 3 Jun 2011, Joe Thornber wrote:

> On Thu, Jun 02, 2011 at 03:55:16PM -0400, Mikulas Patocka wrote:
> > > iv) you haven't explained how the sys admin works out the correct
> > > throttle value.
> > 
> > There is no "correct" value. The "correct" value depends on how important 
> > is copying itself v.s. other i/o.
> 
> So who is going to set this?  Do you really have no advice for them
> beyond 'there is no correct value'?

I did some tests. I did some random and sequential I/O on the same disk 
while mirror was resynchronizing. The result is this:

random read:
mirror throttle / IOs per second
100%	94
75%	113
50%	127
25%	145
idle	160

sequential read:
mirror throttle / throughput MB/s
100%	104
75%	118
50%	122
25%	127
idle	134

Mikulas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-07 17:50       ` Mikulas Patocka
@ 2011-06-09  9:47         ` Joe Thornber
  2011-06-09 16:08           ` Mikulas Patocka
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2011-06-09  9:47 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, Alasdair G. Kergon

Mikulas,

On Tue, Jun 07, 2011 at 01:50:16PM -0400, Mikulas Patocka wrote:
> I did some tests. I did some random and sequential I/O on the same disk 
> while mirror was resynchronizing. The result is this:
> 
> random read:
> mirror throttle / IOs per second
> 100%	94
> 75%	113
> 50%	127
> 25%	145
> idle	160
> 
> sequential read:
> mirror throttle / throughput MB/s
> 100%	104
> 75%	118
> 50%	122
> 25%	127
> idle	134

This is nice to see, thanks for providing it.  I'm still uneasy about
this throttling being time based though.

What we're trying to do is avoid kcopyd issuing so much io that it
interferes with userland io.

Now you've already posted some nice patches that remove the fixed
sized buffers in kcopyd clients, instead grabbing extra memory if it's
available, otherwise making do with a tiny preallocated buffer to
ensure progress.  So as I see it, the amount of io that kcopyd submits
is dependent on the amount of memory available.

i) If there is lots of memory available can your throttling patch
still manage to issue too much io in the time that kcopyd is active?

ii) If there is little memory available few ios will be issued.  But
your throttling will still occur, slowing things down even more.

I think it makes much more sense to throttle based on amount of io
issued by kcopyd.  Either tracking throughput, or even just putting a
limit on the amount of io that can be in flight at any one time.

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-09  9:47         ` Joe Thornber
@ 2011-06-09 16:08           ` Mikulas Patocka
  2011-06-09 16:27             ` Alasdair G Kergon
  2011-06-10  8:44             ` Joe Thornber
  0 siblings, 2 replies; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-09 16:08 UTC (permalink / raw)
  To: Joe Thornber; +Cc: device-mapper development, Alasdair G. Kergon



On Thu, 9 Jun 2011, Joe Thornber wrote:

> Mikulas,
> 
> On Tue, Jun 07, 2011 at 01:50:16PM -0400, Mikulas Patocka wrote:
> > I did some tests. I did some random and sequential I/O on the same disk 
> > while mirror was resynchronizing. The result is this:
> > 
> > random read:
> > mirror throttle / IOs per second
> > 100%	94
> > 75%	113
> > 50%	127
> > 25%	145
> > idle	160
> > 
> > sequential read:
> > mirror throttle / throughput MB/s
> > 100%	104
> > 75%	118
> > 50%	122
> > 25%	127
> > idle	134
> 
> This is nice to see, thanks for providing it.  I'm still uneasy about
> this throttling being time based though.
> 
> What we're trying to do is avoid kcopyd issuing so much io that it
> interferes with userland io.

But you don't know if there is some userland IO or not to the same disk.

> Now you've already posted some nice patches that remove the fixed
> sized buffers in kcopyd clients, instead grabbing extra memory if it's
> available, otherwise making do with a tiny preallocated buffer to
> ensure progress.  So as I see it, the amount of io that kcopyd submits
> is dependent on the amount of memory available.
> 
> i) If there is lots of memory available can your throttling patch
> still manage to issue too much io in the time that kcopyd is active?

It issues as much IO as it can in the active period.

> ii) If there is little memory available few ios will be issued.  But
> your throttling will still occur, slowing things down even more.

Yes. Memory pressure and throttiling are independent things.

> I think it makes much more sense to throttle based on amount of io
> issued by kcopyd.  Either tracking throughput,

You don't know what is the throughput of the device. So throttling to 
something like "50% throughput" can't be done.

Measuring device throughput is unreliable. For example, if there is 
concurrent IO to the same device, your observed throughput is lower than 
the actual device throughput. But you never know that there is a 
concurrent IO, so you can't find that the measuring is skewed.

Also, throughput depends on the disk region you are accessing --- the same 
disk has higher throughput in the beginning than in the end.

Also, one dm-raid leg can span multiple disks, and they can have different 
throughput.

You can't reliably determine the device throughput, so you can't reliably 
throttle based throughput.

> or even just putting a
> limit on the amount of io that can be in flight at any one time.

Which is much less reliable throttling than time slots.
the resync spee is:
8 sub jobs --- 76MB/s
2 sub jobs --- 74MB/s
1 sub job --- 65MB/s

> - Joe

Mikulas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-09 16:08           ` Mikulas Patocka
@ 2011-06-09 16:27             ` Alasdair G Kergon
  2011-06-10  8:44             ` Joe Thornber
  1 sibling, 0 replies; 23+ messages in thread
From: Alasdair G Kergon @ 2011-06-09 16:27 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, Joe Thornber

On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> 8 sub jobs --- 76MB/s
> 2 sub jobs --- 74MB/s

Interesting data point: In your set up, the extra 6 sub jobs bring
little benefit.

Alasdair

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-09 16:08           ` Mikulas Patocka
  2011-06-09 16:27             ` Alasdair G Kergon
@ 2011-06-10  8:44             ` Joe Thornber
  2011-06-10  9:28               ` Lars Ellenberg
  2011-06-11 20:27               ` Mikulas Patocka
  1 sibling, 2 replies; 23+ messages in thread
From: Joe Thornber @ 2011-06-10  8:44 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, Alasdair G. Kergon

On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> 
> 
> On Thu, 9 Jun 2011, Joe Thornber wrote:
> > What we're trying to do is avoid kcopyd issuing so much io that it
> > interferes with userland io.
> 
> But you don't know if there is some userland IO or not to the same disk.

None the less, this was the motivation Alasdair gave for wanting this
throttling.

> > i) If there is lots of memory available can your throttling patch
> > still manage to issue too much io in the time that kcopyd is active?
> 
> It issues as much IO as it can in the active period.

Exactly, it can issue too much.

> > ii) If there is little memory available few ios will be issued.  But
> > your throttling will still occur, slowing things down even more.
> 
> Yes. Memory pressure and throttiling are independent things.

True, but if kcopyd has only managed to submit 50k of io in it's last
timeslice why on earth would you decide to put it to sleep rather than
try and issue some more?  I don't believe your time based throttling
behaves the same under different memory pressure situations.  So the
sys admin could set up your throttle parameters under one set of
conditions.  Then these conditions could change and result in either
too much or too little throttling.

> 
> > I think it makes much more sense to throttle based on amount of io
> > issued by kcopyd.  Either tracking throughput,
> 
> You don't know what is the throughput of the device. So throttling to 
> something like "50% throughput" can't be done.

I agree we don't know what the throughput on the devices is.  What I
meant was to throttle the volume of io that kcopyd generates against
an absolute value.  eg.  "The mirror kcopyd client cannot submit more
than 100M of io per second."  So you don't have to measure and
calculate any theoretical maximum throughput and calculate percentages
off that.

> > or even just putting a
> > limit on the amount of io that can be in flight at any one time.
> 
> Which is much less reliable throttling than time slots.
> the resync spee is:
> 8 sub jobs --- 76MB/s
> 2 sub jobs --- 74MB/s
> 1 sub job --- 65MB/s

I really don't understand these figures.  Why doesn't it scale
linearly with the number of sub jobs?  Are the sub jobs all the same
size in these cases?  Is this with your throttling?  Are the sub jobs
so large that memory pressure is imposing a max limit of in flight io?

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10  8:44             ` Joe Thornber
@ 2011-06-10  9:28               ` Lars Ellenberg
  2011-06-10 10:14                 ` Joe Thornber
  2011-06-11 20:27               ` Mikulas Patocka
  1 sibling, 1 reply; 23+ messages in thread
From: Lars Ellenberg @ 2011-06-10  9:28 UTC (permalink / raw)
  To: device-mapper development; +Cc: Mikulas Patocka, Alasdair G. Kergon

On Fri, Jun 10, 2011 at 09:44:25AM +0100, Joe Thornber wrote:
> On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> > 
> > 
> > On Thu, 9 Jun 2011, Joe Thornber wrote:
> > > What we're trying to do is avoid kcopyd issuing so much io that it
> > > interferes with userland io.
> > 
> > But you don't know if there is some userland IO or not to the same disk.
> 
> None the less, this was the motivation Alasdair gave for wanting this
> throttling.

Not sure if it helps,
but are you familiar with the MD raid rebuild throttling?

see is_mddev_idle() in drivers/md/md.c

It basically looks at read/write sector part_stats, and tracks its
own resync IO issued to the disk in question (in the special
gendisk member sync_io -- you'd probably need to use your own
counter in some other kcopyd context struct).

Then it figures, if there is "significant" increase in the
partition stats that cannot be accounted for by its own resync IO,
that should be userland IO, which means the device is not idle.

Then it has minimum and maximum bandwidth limits.
It won't use up more than maximum bandwidth,
even if the disk(s) seem to be idle.

If the disk seems to be NOT idle, it still would only throttle,
if its current estimated resync bandwidth is above the minimum
bandwidth limit.

Maybe a similar algorithm could be used for kcopyd?


	Lars

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10  9:28               ` Lars Ellenberg
@ 2011-06-10 10:14                 ` Joe Thornber
  2011-06-10 13:41                   ` Mikulas Patocka
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2011-06-10 10:14 UTC (permalink / raw)
  To: device-mapper development; +Cc: Mikulas Patocka, Alasdair G. Kergon

On Fri, Jun 10, 2011 at 11:28:34AM +0200, Lars Ellenberg wrote:
> On Fri, Jun 10, 2011 at 09:44:25AM +0100, Joe Thornber wrote:
> > On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Thu, 9 Jun 2011, Joe Thornber wrote:
> > > > What we're trying to do is avoid kcopyd issuing so much io that it
> > > > interferes with userland io.
> > > 
> > > But you don't know if there is some userland IO or not to the same disk.
> > 
> > None the less, this was the motivation Alasdair gave for wanting this
> > throttling.
> 
> Not sure if it helps,
> but are you familiar with the MD raid rebuild throttling?

Lars,

This is very helpful, thankyou.  Any thoughts on this Mikulas?

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10 10:14                 ` Joe Thornber
@ 2011-06-10 13:41                   ` Mikulas Patocka
  2011-06-10 13:48                     ` Joe Thornber
  2011-06-10 13:51                     ` Mike Snitzer
  0 siblings, 2 replies; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-10 13:41 UTC (permalink / raw)
  To: Joe Thornber; +Cc: device-mapper development, Alasdair G. Kergon



On Fri, 10 Jun 2011, Joe Thornber wrote:

> On Fri, Jun 10, 2011 at 11:28:34AM +0200, Lars Ellenberg wrote:
> > On Fri, Jun 10, 2011 at 09:44:25AM +0100, Joe Thornber wrote:
> > > On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> > > > 
> > > > 
> > > > On Thu, 9 Jun 2011, Joe Thornber wrote:
> > > > > What we're trying to do is avoid kcopyd issuing so much io that it
> > > > > interferes with userland io.
> > > > 
> > > > But you don't know if there is some userland IO or not to the same disk.
> > > 
> > > None the less, this was the motivation Alasdair gave for wanting this
> > > throttling.
> > 
> > Not sure if it helps,
> > but are you familiar with the MD raid rebuild throttling?
> 
> Lars,
> 
> This is very helpful, thankyou.  Any thoughts on this Mikulas?
> 
> - Joe

This works if the data is directly on the partition, but it won't work on 
the device mapper (if MD is on the device mapper, it won't work too).

The device mapper has no function to tell where the bio is finally 
remapped, so you can't read disk statistics for that particular disk.

Mikulas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10 13:41                   ` Mikulas Patocka
@ 2011-06-10 13:48                     ` Joe Thornber
  2011-06-10 16:13                       ` Lars Ellenberg
  2011-06-10 13:51                     ` Mike Snitzer
  1 sibling, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2011-06-10 13:48 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, Alasdair G. Kergon

On Fri, Jun 10, 2011 at 09:41:50AM -0400, Mikulas Patocka wrote:
> This works if the data is directly on the partition, but it won't work on 
> the device mapper (if MD is on the device mapper, it won't work too).
> 
> The device mapper has no function to tell where the bio is finally 
> remapped, so you can't read disk statistics for that particular disk.

ok, thx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10 13:41                   ` Mikulas Patocka
  2011-06-10 13:48                     ` Joe Thornber
@ 2011-06-10 13:51                     ` Mike Snitzer
  1 sibling, 0 replies; 23+ messages in thread
From: Mike Snitzer @ 2011-06-10 13:51 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: device-mapper development, Joe Thornber, Alasdair G. Kergon

On Fri, Jun 10 2011 at  9:41am -0400,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Fri, 10 Jun 2011, Joe Thornber wrote:
> 
> > On Fri, Jun 10, 2011 at 11:28:34AM +0200, Lars Ellenberg wrote:
> > > On Fri, Jun 10, 2011 at 09:44:25AM +0100, Joe Thornber wrote:
> > > > On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> > > > > 
> > > > > 
> > > > > On Thu, 9 Jun 2011, Joe Thornber wrote:
> > > > > > What we're trying to do is avoid kcopyd issuing so much io that it
> > > > > > interferes with userland io.
> > > > > 
> > > > > But you don't know if there is some userland IO or not to the same disk.
> > > > 
> > > > None the less, this was the motivation Alasdair gave for wanting this
> > > > throttling.
> > > 
> > > Not sure if it helps,
> > > but are you familiar with the MD raid rebuild throttling?
> > 
> > Lars,
> > 
> > This is very helpful, thankyou.  Any thoughts on this Mikulas?
> > 
> > - Joe
> 
> This works if the data is directly on the partition, but it won't work on 
> the device mapper (if MD is on the device mapper, it won't work too).
> 
> The device mapper has no function to tell where the bio is finally 
> remapped, so you can't read disk statistics for that particular disk.

DM raid in terms of MD (as Jon and Neil are working on) has functional
resync throttling.  So DM wrapping MD raid gives us what we need without
the need to change kcopyd no?

Mike

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10 13:48                     ` Joe Thornber
@ 2011-06-10 16:13                       ` Lars Ellenberg
  0 siblings, 0 replies; 23+ messages in thread
From: Lars Ellenberg @ 2011-06-10 16:13 UTC (permalink / raw)
  To: Joe Thornber
  Cc: device-mapper development, Mikulas Patocka, Alasdair G. Kergon

On Fri, Jun 10, 2011 at 02:48:00PM +0100, Joe Thornber wrote:
> On Fri, Jun 10, 2011 at 09:41:50AM -0400, Mikulas Patocka wrote:
> > This works if the data is directly on the partition, but it won't work on 
> > the device mapper (if MD is on the device mapper, it won't work too).
> > 
> > The device mapper has no function to tell where the bio is finally 
> > remapped, so you can't read disk statistics for that particular disk.
> 
> ok, thx

Right, you do not know which phyical device the IO may end up on,
if it passes through several stacking layers.

But even if that would be desirable in the long term,
it may not be necessary right now? [*]

Jobs on one mapping would not notice that the same physical device may
be busy via some other mapping.

But at least it would notice user (ok, non-kcopyd) IO on the same source
or destination mapping(s), which is a big improvement already, compared
with completely ignoring what is going on?

struct kcopyd_job {
   ...
   struct dm_io_region source;
   struct dm_io_region dests[...];
   ...
}

struct dm_io_region {
   struct block_device *bdev;
   ...
}

struct block_device {
   ...
   struct gendisk *bd_disk;
   ...
}

struct gendisk {
   ...
   struct hd_struct part0;   <--- there live the stats.
   ...
}

[*]
/me wildly handwaving
That could be solved, if we want to.
Add to block_device_operations:
   struct gendisk * (*get_lower_level_gendisk)(struct gendisk*);


struct gendisk *appropriate_function_name(struct gendisk *d)
{
	while (d->fops->get_lower_level_gendisk)
		d = d->fops->get_lower_level_gendisk(d);
	return d;
}

Then have some convenience functions part_stat_read_lowest_level(...).

May have come up before, probably was rejected because of uglyness?

  ;-)


	Lars

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-10  8:44             ` Joe Thornber
  2011-06-10  9:28               ` Lars Ellenberg
@ 2011-06-11 20:27               ` Mikulas Patocka
  2011-06-13  9:17                 ` Joe Thornber
  1 sibling, 1 reply; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-11 20:27 UTC (permalink / raw)
  To: Joe Thornber; +Cc: device-mapper development, Alasdair G. Kergon



On Fri, 10 Jun 2011, Joe Thornber wrote:

> On Thu, Jun 09, 2011 at 12:08:08PM -0400, Mikulas Patocka wrote:
> > 
> > 
> > On Thu, 9 Jun 2011, Joe Thornber wrote:
> > > What we're trying to do is avoid kcopyd issuing so much io that it
> > > interferes with userland io.
> > 
> > But you don't know if there is some userland IO or not to the same disk.
> 
> None the less, this was the motivation Alasdair gave for wanting this
> throttling.
> 
> > > i) If there is lots of memory available can your throttling patch
> > > still manage to issue too much io in the time that kcopyd is active?
> > 
> > It issues as much IO as it can in the active period.
> 
> Exactly, it can issue too much.

As shown in the numbers below, submitting one IO reduces throughput from 
76MB/s to 65MB/s. Reducing number of IOs doesn't help much.

> > > ii) If there is little memory available few ios will be issued.  But
> > > your throttling will still occur, slowing things down even more.
> > 
> > Yes. Memory pressure and throttiling are independent things.
> 
> True, but if kcopyd has only managed to submit 50k of io in it's last
> timeslice why on earth would you decide to put it to sleep rather than
> try and issue some more?

Because the issues of memory pressure and throttling are idependent.

> I don't believe your time based throttling
> behaves the same under different memory pressure situations.  So the
> sys admin could set up your throttle parameters under one set of
> conditions.  Then these conditions could change and result in either
> too much or too little throttling.

I tried it and it thottled with one sub job as well as with many sub jobs.

I think you are tackling multiple independent things here. In order to 
have understandable code and understandable behaviour, the throttle code 
must manage *JUST*ONE*THING* (the time when i/os are submitted, in this 
case). If you start making exceptions such as "what if there is a memory 
pressure" or "apart from restricting the time, send less i/o", then the 
logic behind the code becomes unintelligible.

It is much easier to explain to the users "if you set X value in 
/sys/module/dm_mirror/parameters/raid1_resync_throttle, then the copying 
will be done in X% of time, leaving the disk idle 100-X% of time", then to 
invent some magic mechanisms that change multiple things based on X and 
other conditions.

> > > I think it makes much more sense to throttle based on amount of io
> > > issued by kcopyd.  Either tracking throughput,
> > 
> > You don't know what is the throughput of the device. So throttling to 
> > something like "50% throughput" can't be done.
> 
> I agree we don't know what the throughput on the devices is.  What I
> meant was to throttle the volume of io that kcopyd generates against
> an absolute value.  eg.  "The mirror kcopyd client cannot submit more
> than 100M of io per second."  So you don't have to measure and
> calculate any theoretical maximum throughput and calculate percentages
> off that.

And what if the beginning of the disk has 130MB/s and the end 80MB/s? Then 
throughput-based throttle would behave differently based on the part of 
the disk that is being resynchronized.

> > > or even just putting a
> > > limit on the amount of io that can be in flight at any one time.
> > 
> > Which is much less reliable throttling than time slots.
> > the resync spee is:
> > 8 sub jobs --- 76MB/s
> > 2 sub jobs --- 74MB/s
> > 1 sub job --- 65MB/s
> 
> I really don't understand these figures.  Why doesn't it scale
> linearly with the number of sub jobs?  Are the sub jobs all the same
> size in these cases?  Is this with your throttling?  Are the sub jobs
> so large that memory pressure is imposing a max limit of in flight io?

It is without throttling. It doesn't scale linearly because the disk has 
just one mechanical arm with all the heads, so it processes all jobs 
one-by-one anyway.

If the disk had multiple arms, it would scale linearly.

> - Joe

Mikulas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-11 20:27               ` Mikulas Patocka
@ 2011-06-13  9:17                 ` Joe Thornber
  2011-06-13 21:06                   ` Mikulas Patocka
  0 siblings, 1 reply; 23+ messages in thread
From: Joe Thornber @ 2011-06-13  9:17 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, Alasdair G. Kergon

On Sat, Jun 11, 2011 at 04:27:02PM -0400, Mikulas Patocka wrote:
> It is much easier to explain to the users "if you set X value in 
> /sys/module/dm_mirror/parameters/raid1_resync_throttle, then the copying 
> will be done in X% of time, leaving the disk idle 100-X% of time", then to 
> invent some magic mechanisms that change multiple things based on X and 
> other conditions.

Yes, that is very easy to explain to the users.  This sort of
description is what I was after when I kept asking you to tell me how
to set it.

It's not clear to me that setting the throttle to 80% will copy at 80%
of the speed, or leave the disk idle for 20% of the time.  Perhaps you
have some benchmarks to back this up?  (with different memory pressure
situations please).

Restricting the cpu available to issue io is not a great way to
throttle io.  There will be plenty of situations where 20% of the cpu
is enough to swamp the devices.

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-13  9:17                 ` Joe Thornber
@ 2011-06-13 21:06                   ` Mikulas Patocka
  2011-06-14  8:34                     ` Joe Thornber
  0 siblings, 1 reply; 23+ messages in thread
From: Mikulas Patocka @ 2011-06-13 21:06 UTC (permalink / raw)
  To: Joe Thornber; +Cc: device-mapper development, Alasdair G. Kergon



On Mon, 13 Jun 2011, Joe Thornber wrote:

> On Sat, Jun 11, 2011 at 04:27:02PM -0400, Mikulas Patocka wrote:
> > It is much easier to explain to the users "if you set X value in 
> > /sys/module/dm_mirror/parameters/raid1_resync_throttle, then the copying 
> > will be done in X% of time, leaving the disk idle 100-X% of time", then to 
> > invent some magic mechanisms that change multiple things based on X and 
> > other conditions.
> 
> Yes, that is very easy to explain to the users.  This sort of
> description is what I was after when I kept asking you to tell me how
> to set it.
> 
> It's not clear to me that setting the throttle to 80% will copy at 80%
> of the speed, or leave the disk idle for 20% of the time.  Perhaps you
> have some benchmarks to back this up?  (with different memory pressure
> situations please).

See below. It is not exatly linear, but it is approximately linear well 
enough.

> Restricting the cpu available to issue io is not a great way to
> throttle io.  There will be plenty of situations where 20% of the cpu
> is enough to swamp the devices.
>
> - Joe

You misunderstand the patch. The patch doesn't restrict cpu time. The 
patch measures global time and time when one or more i/os was in flight 
(it doesn't count the number of i/os in flight). Every second, both these 
variables are divided by 2 (to provide some decay). The proportion of 
these two variables is the percentage of actual disk activity. When this 
proportion is greater than the percentage set by the user, we sleep.



Resync speed is this:

throttle | resync speed MB/s
100%	75
90%	66
80%	60
70%	49
60%	44
50%	33
40%	25
30%	17
20%	10
10%	5
0%	0.26

If I simulate memory pressure (i.e. don't allow allocations and process 
just one sub job), the speed is this:

100%	70
90%	62
80%	55
70%	47
60%	37
50%	29
40%	25
30%	16
20%	9
10%	2
0%	0.12

Mikulas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure
  2011-06-13 21:06                   ` Mikulas Patocka
@ 2011-06-14  8:34                     ` Joe Thornber
  0 siblings, 0 replies; 23+ messages in thread
From: Joe Thornber @ 2011-06-14  8:34 UTC (permalink / raw)
  To: Mikulas Patocka; +Cc: device-mapper development, Alasdair G. Kergon

On Mon, Jun 13, 2011 at 05:06:29PM -0400, Mikulas Patocka wrote:
> See below. It is not exatly linear, but it is approximately linear well 
> enough.

Thanks Mikulas, these results are very nice.  I'll merge this set into
my tree and give it a go.

- Joe

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2011-06-14  8:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-31 22:03 [PATCH 1/2] dm-kcopyd: introduce per-module throttle structure Mikulas Patocka
2011-06-01  6:18 ` Ankit Jain
2011-06-01  7:13   ` Ankit Jain
2011-06-02 19:16   ` Mikulas Patocka
2011-06-01  9:51 ` Joe Thornber
2011-06-02 19:55   ` Mikulas Patocka
2011-06-03 11:01     ` Joe Thornber
2011-06-03 15:54       ` Mike Snitzer
2011-06-07 17:50       ` Mikulas Patocka
2011-06-09  9:47         ` Joe Thornber
2011-06-09 16:08           ` Mikulas Patocka
2011-06-09 16:27             ` Alasdair G Kergon
2011-06-10  8:44             ` Joe Thornber
2011-06-10  9:28               ` Lars Ellenberg
2011-06-10 10:14                 ` Joe Thornber
2011-06-10 13:41                   ` Mikulas Patocka
2011-06-10 13:48                     ` Joe Thornber
2011-06-10 16:13                       ` Lars Ellenberg
2011-06-10 13:51                     ` Mike Snitzer
2011-06-11 20:27               ` Mikulas Patocka
2011-06-13  9:17                 ` Joe Thornber
2011-06-13 21:06                   ` Mikulas Patocka
2011-06-14  8:34                     ` Joe Thornber

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.