From: Vivek Goyal <vgoyal@redhat.com>
To: linux-kernel@vger.kernel.org, containers@lists.linux-foundation.org,
	dm-devel@redhat.com, jens.axboe@oracle.com, nauman@google.com,
	dpshah@google.com, lizf@cn.fujitsu.com, mikew@google.com,
	fchecconi@gmail.com, paolo.valente@unimore.it, ryov@valinux.co.jp,
	fernando@oss.ntt.co.jp, s-uchida@ap.jp.nec.com, taka@valinux.co.jp,
	guijianfeng@cn.fujitsu.com, jmoyer@redhat.com,
	dhaval@linux.vnet.ibm.com, balbir@linux.vnet.ibm.com,
	righi.andrea@gmail.com, m-ikeda@ds.jp.nec.com, jbaron@redhat.com
Cc: agk@redhat.com, snitzer@redhat.com, vgoyal@redhat.com,
	akpm@linux-foundation.org, peterz@infradead.org
Subject: [PATCH 23/25] io-controller: Support per cgroup per device weights and io class
Date: Thu, 2 Jul 2009 16:01:55 -0400
Message-ID: <1246564917-19603-24-git-send-email-vgoyal@redhat.com>
In-Reply-To: <1246564917-19603-1-git-send-email-vgoyal@redhat.com>

This patch enables per-cgroup per-device weight and ioprio_class handling.
A new cgroup interface file, "policy", is introduced. Use this file to
configure weight and ioprio_class for each device in a given cgroup.
The original "weight" and "ioprio_class" files are still available. If no
per-device policy is configured for a particular device, "weight" and
"ioprio_class" are used as the default values for that device.

Use the following format to write to the new interface:

# echo dev_major:dev_minor weight ioprio_class > /path/to/cgroup/io.policy

weight=0 removes the policy for the device.

Examples:

Configure weight=300 ioprio_class=2 on /dev/sdb (8:16) in this cgroup
# echo "8:16 300 2" > io.policy
# cat io.policy
dev	weight	class
8:16	300	2

Configure weight=500 ioprio_class=1 on /dev/sda (8:0) in this cgroup
# echo "8:0 500 1" > io.policy
# cat io.policy
dev	weight	class
8:0	500	1
8:16	300	2

Remove the policy for /dev/sda in this cgroup
# echo 8:0 0 1 > io.policy
# cat io.policy
dev	weight	class
8:16	300	2

Changelog (v1 -> v2)
- Rename some structures.
- Use spin_lock_irqsave() and spin_unlock_irqrestore() to avoid enabling
  interrupts unconditionally.
- Fix policy setup bug when switching to another io scheduler.
- If a policy is available for a specific device, don't update weight and
  io class when writing "weight" and "ioprio_class".
- Fix a bug when parsing the policy string.

Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 block/elevator-fq.c |  266 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 block/elevator-fq.h |   10 ++
 2 files changed, 272 insertions(+), 4 deletions(-)

diff --git a/block/elevator-fq.c b/block/elevator-fq.c
index 2a2b68d..31b066d 100644
--- a/block/elevator-fq.c
+++ b/block/elevator-fq.c
@@ -17,6 +17,7 @@
 #include <linux/blktrace_api.h>
 #include <linux/seq_file.h>
 #include <linux/biotrack.h>
+#include <linux/genhd.h>
 
 /* Values taken from cfq */
 const int elv_slice_sync = HZ / 10;
@@ -1053,12 +1054,31 @@ static void bfq_init_entity(struct io_entity *entity, struct io_group *iog)
 	entity->sched_data = &iog->sched_data;
 }
 
-static void io_group_init_entity(struct io_cgroup *iocg, struct io_group *iog)
+static struct io_policy_node *policy_search_node(const struct io_cgroup *iocg,
+						 dev_t dev);
+
+static void io_group_init_entity(struct io_cgroup *iocg, struct io_group *iog,
+				 dev_t dev)
 {
 	struct io_entity *entity = &iog->entity;
+	struct io_policy_node *pn;
+	unsigned long flags;
+
+	spin_lock_irqsave(&iocg->lock, flags);
+	pn = policy_search_node(iocg, dev);
+	if (pn) {
+		entity->weight = pn->weight;
+		entity->new_weight = pn->weight;
+		entity->ioprio_class = pn->ioprio_class;
+		entity->new_ioprio_class = pn->ioprio_class;
+	} else {
+		entity->weight = iocg->weight;
+		entity->new_weight = iocg->weight;
+		entity->ioprio_class = iocg->ioprio_class;
+		entity->new_ioprio_class = iocg->ioprio_class;
+	}
+	spin_unlock_irqrestore(&iocg->lock, flags);
 
-	entity->weight = entity->new_weight = iocg->weight;
-	entity->ioprio_class = entity->new_ioprio_class = iocg->ioprio_class;
 	entity->ioprio_changed = 1;
 	entity->my_sched_data = &iog->sched_data;
 }
@@ -1174,6 +1194,227 @@ io_cgroup_lookup_group(struct io_cgroup *iocg, void *key)
 	return NULL;
 }
 
+static int io_cgroup_policy_read(struct cgroup *cgrp, struct cftype *cft,
+				  struct seq_file *m)
+{
+	struct io_cgroup *iocg;
+	struct io_policy_node *pn;
+
+	iocg = cgroup_to_io_cgroup(cgrp);
+
+	if (list_empty(&iocg->policy_list))
+		goto out;
+
+	seq_printf(m, "dev\tweight\tclass\n");
+
+	spin_lock_irq(&iocg->lock);
+	list_for_each_entry(pn, &iocg->policy_list, node) {
+		seq_printf(m, "%u:%u\t%u\t%hu\n", MAJOR(pn->dev),
+			   MINOR(pn->dev), pn->weight, pn->ioprio_class);
+	}
+	spin_unlock_irq(&iocg->lock);
+out:
+	return 0;
+}
+
+static inline void policy_insert_node(struct io_cgroup *iocg,
+				      struct io_policy_node *pn)
+{
+	list_add(&pn->node, &iocg->policy_list);
+}
+
+/* Must be called with iocg->lock held */
+static inline void policy_delete_node(struct io_policy_node *pn)
+{
+	list_del(&pn->node);
+}
+
+/* Must be called with iocg->lock held */
+static struct io_policy_node *policy_search_node(const struct io_cgroup *iocg,
+						 dev_t dev)
+{
+	struct io_policy_node *pn;
+
+	if (list_empty(&iocg->policy_list))
+		return NULL;
+
+	list_for_each_entry(pn, &iocg->policy_list, node) {
+		if (pn->dev == dev)
+			return pn;
+	}
+
+	return NULL;
+}
+
+static int check_dev_num(dev_t dev)
+{
+	int part = 0;
+	struct gendisk *disk;
+
+	disk = get_gendisk(dev, &part);
+	if (!disk || part)
+		return -ENODEV;
+
+	return 0;
+}
+
+static int policy_parse_and_set(char *buf, struct io_policy_node *newpn)
+{
+	char *s[4], *p, *major_s = NULL, *minor_s = NULL;
+	int ret;
+	unsigned long major, minor, temp;
+	int i = 0;
+	dev_t dev;
+
+	memset(s, 0, sizeof(s));
+	while ((p = strsep(&buf, " ")) != NULL) {
+		if (!*p)
+			continue;
+		s[i++] = p;
+
+		/* Prevent from inputting too many things */
+		if (i == 4)
+			break;
+	}
+
+	if (i != 3)
+		return -EINVAL;
+
+	p = strsep(&s[0], ":");
+	if (p != NULL)
+		major_s = p;
+	else
+		return -EINVAL;
+
+	minor_s = s[0];
+	if (!minor_s)
+		return -EINVAL;
+
+	ret = strict_strtoul(major_s, 10, &major);
+	if (ret)
+		return -EINVAL;
+
+	ret = strict_strtoul(minor_s, 10, &minor);
+	if (ret)
+		return -EINVAL;
+
+	dev = MKDEV(major, minor);
+
+	ret = check_dev_num(dev);
+	if (ret)
+		return ret;
+
+	newpn->dev = dev;
+
+	if (s[1] == NULL)
+		return -EINVAL;
+
+	ret = strict_strtoul(s[1], 10, &temp);
+	if (ret || temp > WEIGHT_MAX)
+		return -EINVAL;
+
+	newpn->weight = temp;
+
+	if (s[2] == NULL)
+		return -EINVAL;
+
+	ret = strict_strtoul(s[2], 10, &temp);
+	if (ret || temp < IOPRIO_CLASS_RT || temp > IOPRIO_CLASS_IDLE)
+		return -EINVAL;
+	newpn->ioprio_class = temp;
+
+	return 0;
+}
+
+static int io_cgroup_policy_write(struct cgroup *cgrp, struct cftype *cft,
+				  const char *buffer)
+{
+	struct io_cgroup *iocg;
+	struct io_policy_node *newpn, *pn;
+	char *buf;
+	int ret = 0;
+	int keep_newpn = 0;
+	struct hlist_node *n;
+	struct io_group *iog;
+
+	buf = kstrdup(buffer, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	newpn = kzalloc(sizeof(*newpn), GFP_KERNEL);
+	if (!newpn) {
+		ret = -ENOMEM;
+		goto free_buf;
+	}
+
+	ret = policy_parse_and_set(buf, newpn);
+	if (ret)
+		goto free_newpn;
+
+	if (!cgroup_lock_live_group(cgrp)) {
+		ret = -ENODEV;
+		goto free_newpn;
+	}
+
+	iocg = cgroup_to_io_cgroup(cgrp);
+	spin_lock_irq(&iocg->lock);
+
+	pn = policy_search_node(iocg, newpn->dev);
+	if (!pn) {
+		if (newpn->weight != 0) {
+			policy_insert_node(iocg, newpn);
+			keep_newpn = 1;
+		}
+		goto update_io_group;
+	}
+
+	if (newpn->weight == 0) {
+		/* weight == 0 means deleting a policy */
+		policy_delete_node(pn);
+		goto update_io_group;
+	}
+
+	pn->weight = newpn->weight;
+	pn->ioprio_class = newpn->ioprio_class;
+
+update_io_group:
+	hlist_for_each_entry(iog, n, &iocg->group_data, group_node) {
+		if (iog->dev == newpn->dev) {
+			if (newpn->weight) {
+				iog->entity.new_weight = newpn->weight;
+				iog->entity.new_ioprio_class =
+					newpn->ioprio_class;
+				/*
+				 * iog weight and ioprio_class updating
+				 * actually happens if ioprio_changed is set.
+				 * So ensure ioprio_changed is not set until
+				 * new weight and new ioprio_class are updated.
+				 */
+				smp_wmb();
+				iog->entity.ioprio_changed = 1;
+			} else {
+				iog->entity.new_weight = iocg->weight;
+				iog->entity.new_ioprio_class =
+					iocg->ioprio_class;
+
+				/* The same as above */
+				smp_wmb();
+				iog->entity.ioprio_changed = 1;
+			}
+		}
+	}
+	spin_unlock_irq(&iocg->lock);
+
+	cgroup_unlock();
+
+free_newpn:
+	if (!keep_newpn)
+		kfree(newpn);
+free_buf:
+	kfree(buf);
+	return ret;
+}
+
 #define SHOW_FUNCTION(__VAR)						\
 static u64 io_cgroup_##__VAR##_read(struct cgroup *cgroup,		\
 				       struct cftype *cftype)		\
@@ -1206,6 +1447,7 @@ static int io_cgroup_##__VAR##_write(struct cgroup *cgroup,		\
 	struct io_cgroup *iocg;					\
 	struct io_group *iog;						\
 	struct hlist_node *n;						\
+	struct io_policy_node *pn;					\
 									\
 	if (val < (__MIN) || val > (__MAX))				\
 		return -EINVAL;						\
@@ -1218,6 +1460,9 @@ static int io_cgroup_##__VAR##_write(struct cgroup *cgroup,		\
 	spin_lock_irq(&iocg->lock);					\
 	iocg->__VAR = (unsigned long)val;				\
 	hlist_for_each_entry(iog, n, &iocg->group_data, group_node) {	\
+		pn = policy_search_node(iocg, iog->dev);		\
+		if (pn)							\
+			continue;					\
 		iog->entity.new_##__VAR = (unsigned long)val;		\
 		smp_wmb();						\
 		iog->entity.ioprio_changed = 1;				\
@@ -1295,6 +1540,12 @@ static int io_cgroup_disk_sectors_read(struct cgroup *cgroup,
 
 struct cftype bfqio_files[] = {
 	{
+		.name = "policy",
+		.read_seq_string = io_cgroup_policy_read,
+		.write_string = io_cgroup_policy_write,
+		.max_write_len = 256,
+	},
+	{
 		.name = "weight",
 		.read_u64 = io_cgroup_weight_read,
 		.write_u64 = io_cgroup_weight_write,
@@ -1336,6 +1587,7 @@ static struct cgroup_subsys_state *iocg_create(struct cgroup_subsys *subsys,
 	INIT_HLIST_HEAD(&iocg->group_data);
 	iocg->weight = IO_DEFAULT_GRP_WEIGHT;
 	iocg->ioprio_class = IO_DEFAULT_GRP_CLASS;
+	INIT_LIST_HEAD(&iocg->policy_list);
 
 	return &iocg->css;
 }
@@ -1438,7 +1690,7 @@ io_group_chain_alloc(struct request_queue *q, void *key, struct cgroup *cgroup)
 		sscanf(dev_name(bdi->dev), "%u:%u", &major, &minor);
 		iog->dev = MKDEV(major, minor);
 
-		io_group_init_entity(iocg, iog);
+		io_group_init_entity(iocg, iog, iog->dev);
 		iog->my_entity = &iog->entity;
 
 		atomic_set(&iog->ref, 0);
@@ -1904,6 +2156,7 @@ static void iocg_destroy(struct cgroup_subsys *subsys, struct cgroup *cgroup)
 	struct io_group *iog;
 	struct elv_fq_data *efqd;
 	unsigned long uninitialized_var(flags);
+	struct io_policy_node *pn, *pntmp;
 
 	/*
 	 * io groups are linked in two lists. One list is maintained
@@ -1943,6 +2196,11 @@ remove_entry:
 	goto remove_entry;
 
 done:
+	list_for_each_entry_safe(pn, pntmp, &iocg->policy_list, node) {
+		policy_delete_node(pn);
+		kfree(pn);
+	}
+
 	free_css_id(&io_subsys, &iocg->css);
 	rcu_read_unlock();
 	BUG_ON(!hlist_empty(&iocg->group_data));
diff --git a/block/elevator-fq.h b/block/elevator-fq.h
index 214fb61..58c650b 100644
--- a/block/elevator-fq.h
+++ b/block/elevator-fq.h
@@ -267,6 +267,13 @@ struct io_group {
 	struct request_list rl;
 };
 
+struct io_policy_node {
+	struct list_head node;
+	dev_t dev;
+	unsigned int weight;
+	unsigned short ioprio_class;
+};
+
 /**
  * struct io_cgroup - io cgroup data structure.
  * @css: subsystem state for io in the containing cgroup.
@@ -284,6 +291,9 @@ struct io_cgroup {
 	unsigned int weight;
 	unsigned short ioprio_class;
 
+	/* list of io_policy_node */
+	struct list_head policy_list;
+
 	spinlock_t lock;
 	struct hlist_head group_data;
 };
-- 
1.6.0.6
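
Note for experimenting outside the kernel: the following is a minimal user-space sketch of the "major:minor weight ioprio_class" string format and of the fallback from a per-device rule to the cgroup-wide default. It is not part of the patch. The names struct policy_rule, parse_policy_line(), effective_weight() and the SKETCH_* constants are made up for illustration, and the upper weight bound of 1000 is an assumption (the patch only refers to WEIGHT_MAX). sscanf() is also somewhat more lenient about whitespace and signs than the strsep()/strict_strtoul() parser in the patch, and the sketch does not check that the device exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed bound for illustration; the patch checks against WEIGHT_MAX. */
#define SKETCH_WEIGHT_MAX 1000u

/* ioprio classes as in linux/ioprio.h: 1 = RT, 2 = BE, 3 = IDLE. */
#define SKETCH_CLASS_RT   1u
#define SKETCH_CLASS_IDLE 3u

/* Hypothetical user-space counterpart of struct io_policy_node. */
struct policy_rule {
	unsigned int major;
	unsigned int minor;
	unsigned int weight;		/* 0 asks for removal of the rule */
	unsigned int ioprio_class;
};

/*
 * Parse "major:minor weight ioprio_class".
 * Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int parse_policy_line(const char *line, struct policy_rule *out)
{
	unsigned int major, minor, weight, cls;
	char extra;
	int fields;

	assert(line != NULL && out != NULL);

	/* The trailing " %c" only converts if an extra token is present. */
	fields = sscanf(line, "%u:%u %u %u %c",
			&major, &minor, &weight, &cls, &extra);
	if (fields != 4)
		return -1;
	if (weight > SKETCH_WEIGHT_MAX)
		return -1;
	if (cls < SKETCH_CLASS_RT || cls > SKETCH_CLASS_IDLE)
		return -1;

	out->major = major;
	out->minor = minor;
	out->weight = weight;
	out->ioprio_class = cls;
	return 0;
}

/*
 * Weight that applies to a device: the per-device rule if one exists,
 * otherwise the cgroup-wide default.
 */
static unsigned int effective_weight(const struct policy_rule *rules,
				     size_t nr_rules,
				     unsigned int major, unsigned int minor,
				     unsigned int default_weight)
{
	size_t i;

	for (i = 0; i < nr_rules; i++) {
		if (rules[i].major == major && rules[i].minor == minor)
			return rules[i].weight;
	}
	return default_weight;
}
```

With this sketch, parse_policy_line("8:16 300 2", &r) fills r with device 8:16, weight 300 and class 2, while a line with a missing or extra field is rejected. A device without a matching rule gets the default weight passed to effective_weight(), which mirrors the pn / iocg fallback in io_group_init_entity().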
2009-08-04 2:02 ` Munehiro Ikeda 2009-08-04 2:04 ` Munehiro Ikeda 2009-07-27 2:10 ` [RFC] IO scheduler based IO controller V6 Gui Jianfeng [not found] ` <4A6D0C9A.3080600-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org> 2009-07-27 12:55 ` Vivek Goyal 2009-07-27 12:55 ` Vivek Goyal 2009-07-27 12:55 ` Vivek Goyal 2009-07-28 3:27 ` Vivek Goyal 2009-07-28 3:27 ` Vivek Goyal [not found] ` <20090728032712.GC3620-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-07-28 3:36 ` Gui Jianfeng 2009-07-28 3:36 ` Gui Jianfeng 2009-07-28 3:36 ` Gui Jianfeng [not found] ` <20090727125503.GA24449-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-07-28 3:27 ` Vivek Goyal 2009-07-28 11:36 ` Gui Jianfeng 2009-07-29 9:07 ` Gui Jianfeng 2009-07-28 11:36 ` Gui Jianfeng 2009-07-29 9:07 ` Gui Jianfeng