* [PATCH 1/7] blkio-cgroup: Introduction
@ 2009-04-28 12:44 Ryo Tsuruta
  2009-04-28 12:44 ` [PATCH 2/7] blkio-cgroup: The new page_cgroup framework Ryo Tsuruta
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:44 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

Hi all,

This is a new release of blkio-cgroup which provides an IO tracking
mechanism. You can also download this series of patches from
http://people.valinux.co.jp/~ryov/blkio-cgroup/

Changes from the previous release
=================================
- bio-cgroup renamed to blkio-cgroup.
- Use part of page_cgroup->flags to store the blkio-cgroup ID.
  This code is taken from Andrea's io-throttle.
  http://download.systemimager.org/~arighi/linux/patches/io-throttle/cgroup-io-throttle-v14.patch
- Add a new function blkio_cgroup_lookup(ID), which can be called from
  other cgroup subsystems and returns the cgroup associated with a
  given blkio-cgroup ID. This makes it easy to use blkio-cgroup from
  other cgroup subsystems.
- Add an extra patch which reduces the overhead of IO tracking.
- Can be applied to 2.6.30-rc3 and 2.6.30-rc3-git3.
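
As an illustration of storing an ID in part of a flags word, here is
a minimal user-space sketch. The bit layout (a 16-bit ID in the top
bits) and the names flags_set_id()/flags_get_id() are assumptions made
for this example only; they are not the helpers or the layout used by
the patches.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical layout: the top ID_BITS bits of the flags word hold the ID,
 * the remaining low bits stay available for ordinary flag bits. */
#define FLAGS_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define ID_BITS		16
#define ID_SHIFT	(FLAGS_BITS - ID_BITS)
#define ID_MASK		(((1UL << ID_BITS) - 1) << ID_SHIFT)

/* Store id in the upper bits without disturbing the low flag bits. */
static unsigned long flags_set_id(unsigned long flags, unsigned long id)
{
	assert(id < (1UL << ID_BITS));
	return (flags & ~ID_MASK) | (id << ID_SHIFT);
}

/* Recover the ID stored by flags_set_id(). */
static unsigned long flags_get_id(unsigned long flags)
{
	return (flags & ID_MASK) >> ID_SHIFT;
}
```

Sharing one word this way avoids adding a field to every per-page
structure, at the cost of limiting the number of distinct IDs.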

What's blkio-cgroup all about?
==============================

With this feature, you can determine the owner of any type of I/O.
This lets dm-ioband, an I/O bandwidth controller, control block I/O
bandwidth even when it accepts delayed write requests, because
dm-ioband can find the owner cgroup of each request. Others working
on I/O bandwidth throttling could also use this functionality to
control asynchronous I/O with a little enhancement.

Setting up blkio-cgroup
=======================

You have to apply the dm-ioband patch before applying this series of
blkio-cgroup patches.
You also have to select the following config options when compiling
the kernel.

      CONFIG_CGROUPS=y
      CONFIG_CGROUP_BLKIO=y

I also recommend selecting the options for the cgroup memory
subsystem. Giving a cgroup both some I/O bandwidth and some memory
lets you control delayed write requests: the processes in the cgroup
can dirty pages only within the cgroup's own memory, even when the
given bandwidth is narrow.

      CONFIG_RESOURCE_COUNTERS=y
      CONFIG_CGROUP_MEM_RES_CTLR=y

Using blkio-cgroup
==================

The following shows how to use dm-ioband with cgroups. Assume that
you want to make two cgroups, which we call "blkio cgroups" here, to
track block I/O and assign them to the ioband device "ioband1".

First, mount the blkio cgroup filesystem.

      # mount -t cgroup -o blkio none /cgroup

Then, make new blkio cgroups and put some processes in them.

      # mkdir /cgroup/grp1
      # mkdir /cgroup/grp2
      # echo 1234 > /cgroup/grp1/tasks
      # echo 5678 > /cgroup/grp2/tasks

Now, check the ID of each blkio cgroup that was just created.

      # cat /cgroup/grp1/blkio.id
      2
      # cat /cgroup/grp2/blkio.id
      3

Finally, attach the cgroups to "ioband1" and assign them weights.

      # dmsetup message ioband1 0 type cgroup
      # dmsetup message ioband1 0 attach 2
      # dmsetup message ioband1 0 attach 3
      # dmsetup message ioband1 0 weight 2:30
      # dmsetup message ioband1 0 weight 3:60
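
To make the two weights concrete: assuming the ioband device shares
bandwidth in proportion to the weights, and ignoring any other groups
on the device (both are assumptions, not something stated in this
message), the split between the two cgroups can be computed as in
this sketch. share_percent() is a hypothetical helper, not part of
dm-ioband.

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of the total weight held by group idx (integer division). */
static unsigned int share_percent(const unsigned int *weights, size_t n,
				  size_t idx)
{
	unsigned long total = 0;
	size_t i;

	assert(idx < n);
	for (i = 0; i < n; i++)
		total += weights[i];
	assert(total > 0);
	return (unsigned int)(weights[idx] * 100UL / total);
}
```

With weights 30 and 60, the cgroup with ID 2 would get roughly one
third of the contended bandwidth and the cgroup with ID 3 roughly two
thirds.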


* [PATCH 2/7] blkio-cgroup: The new page_cgroup framework
       [not found] ` <20090428.214423.193684619.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:44   ` Ryo Tsuruta
  0 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:44 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

This patch makes the page_cgroup framework usable even when the
cgroup memory controller is not compiled in, so that blkio-cgroup can
use the framework without the memory controller.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 include/linux/memcontrol.h  |    6 ++++++
 include/linux/mmzone.h      |    4 ++--
 include/linux/page_cgroup.h |    8 +++++---
 init/Kconfig                |    4 ++++
 mm/Makefile                 |    3 ++-
 mm/memcontrol.c             |    6 ++++++
 mm/page_cgroup.c            |    3 +--
 7 files changed, 26 insertions(+), 8 deletions(-)

Index: linux-2.6.30-rc3-git3/include/linux/memcontrol.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/memcontrol.h
+++ linux-2.6.30-rc3-git3/include/linux/memcontrol.h
@@ -37,6 +37,8 @@ struct mm_struct;
  * (Of course, if memcg does memory allocation in future, GFP_KERNEL is sane.)
  */
 
+extern void __init_mem_page_cgroup(struct page_cgroup *pc);
+
 extern int mem_cgroup_newpage_charge(struct page *page, struct mm_struct *mm,
 				gfp_t gfp_mask);
 /* for swap handling */
@@ -120,6 +122,10 @@ extern bool mem_cgroup_oom_called(struct
 #else /* CONFIG_CGROUP_MEM_RES_CTLR */
 struct mem_cgroup;
 
+static inline void __init_mem_page_cgroup(struct page_cgroup *pc)
+{
+}
+
 static inline int mem_cgroup_newpage_charge(struct page *page,
 					struct mm_struct *mm, gfp_t gfp_mask)
 {
Index: linux-2.6.30-rc3-git3/include/linux/mmzone.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/mmzone.h
+++ linux-2.6.30-rc3-git3/include/linux/mmzone.h
@@ -607,7 +607,7 @@ typedef struct pglist_data {
 	int nr_zones;
 #ifdef CONFIG_FLAT_NODE_MEM_MAP	/* means !SPARSEMEM */
 	struct page *node_mem_map;
-#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+#ifdef CONFIG_CGROUP_PAGE
 	struct page_cgroup *node_page_cgroup;
 #endif
 #endif
@@ -958,7 +958,7 @@ struct mem_section {
 
 	/* See declaration of similar field in struct zone */
 	unsigned long *pageblock_flags;
-#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+#ifdef CONFIG_CGROUP_PAGE
 	/*
 	 * If !SPARSEMEM, pgdat doesn't have page_cgroup pointer. We use
 	 * section. (see memcontrol.h/page_cgroup.h about this.)
Index: linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/page_cgroup.h
+++ linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
@@ -1,7 +1,7 @@
 #ifndef __LINUX_PAGE_CGROUP_H
 #define __LINUX_PAGE_CGROUP_H
 
-#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+#ifdef CONFIG_CGROUP_PAGE
 #include <linux/bit_spinlock.h>
 /*
  * Page Cgroup can be considered as an extended mem_map.
@@ -12,9 +12,11 @@
  */
 struct page_cgroup {
 	unsigned long flags;
-	struct mem_cgroup *mem_cgroup;
 	struct page *page;
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	struct mem_cgroup *mem_cgroup;
 	struct list_head lru;		/* per cgroup LRU list */
+#endif
 };
 
 void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -71,7 +73,7 @@ static inline void unlock_page_cgroup(st
 	bit_spin_unlock(PCG_LOCK, &pc->flags);
 }
 
-#else /* CONFIG_CGROUP_MEM_RES_CTLR */
+#else /* CONFIG_CGROUP_PAGE */
 struct page_cgroup;
 
 static inline void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
Index: linux-2.6.30-rc3-git3/init/Kconfig
===================================================================
--- linux-2.6.30-rc3-git3.orig/init/Kconfig
+++ linux-2.6.30-rc3-git3/init/Kconfig
@@ -608,6 +608,10 @@ config CGROUP_MEM_RES_CTLR_SWAP
 
 endif # CGROUPS
 
+config CGROUP_PAGE
+	def_bool y
+	depends on CGROUP_MEM_RES_CTLR
+
 config MM_OWNER
 	bool
 
Index: linux-2.6.30-rc3-git3/mm/Makefile
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/Makefile
+++ linux-2.6.30-rc3-git3/mm/Makefile
@@ -37,4 +37,5 @@ else
 obj-$(CONFIG_SMP) += allocpercpu.o
 endif
 obj-$(CONFIG_QUICKLIST) += quicklist.o
-obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o page_cgroup.o
+obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o
+obj-$(CONFIG_CGROUP_PAGE) += page_cgroup.o
Index: linux-2.6.30-rc3-git3/mm/memcontrol.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/memcontrol.c
+++ linux-2.6.30-rc3-git3/mm/memcontrol.c
@@ -128,6 +128,12 @@ struct mem_cgroup_lru_info {
 	struct mem_cgroup_per_node *nodeinfo[MAX_NUMNODES];
 };
 
+void __meminit __init_mem_page_cgroup(struct page_cgroup *pc)
+{
+	pc->mem_cgroup = NULL;
+	INIT_LIST_HEAD(&pc->lru);
+}
+
 /*
  * The memory controller data structure. The memory controller controls both
  * page cache and RSS per cgroup. We would eventually like to provide
Index: linux-2.6.30-rc3-git3/mm/page_cgroup.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/page_cgroup.c
+++ linux-2.6.30-rc3-git3/mm/page_cgroup.c
@@ -14,9 +14,8 @@ static void __meminit
 __init_page_cgroup(struct page_cgroup *pc, unsigned long pfn)
 {
 	pc->flags = 0;
-	pc->mem_cgroup = NULL;
 	pc->page = pfn_to_page(pfn);
-	INIT_LIST_HEAD(&pc->lru);
+	__init_mem_page_cgroup(pc);
 }
 static unsigned long total_usage;
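
The core of this patch is the empty-stub pattern: common code calls a
hook unconditionally, and the hook is an empty inline function when
the optional feature is not built. A minimal user-space sketch of the
pattern follows. DEMO_MEM_CTLR stands in for the kernel config option,
and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a config option; define it to build the full variant. */
/* #define DEMO_MEM_CTLR 1 */

struct demo_page_cgroup {
	unsigned long flags;
	const void *page;
#ifdef DEMO_MEM_CTLR
	void *mem_cgroup;	/* only present when the feature is built */
#endif
};

#ifdef DEMO_MEM_CTLR
static inline void demo_init_mem(struct demo_page_cgroup *pc)
{
	pc->mem_cgroup = NULL;
}
#else
/* Empty stub: callers need no #ifdef, and the call compiles away. */
static inline void demo_init_mem(struct demo_page_cgroup *pc)
{
	(void)pc;
}
#endif

/* Common initialisation, shared by every user of the structure. */
static void demo_init(struct demo_page_cgroup *pc, const void *page)
{
	assert(pc != NULL);
	pc->flags = 0;
	pc->page = page;
	demo_init_mem(pc);
}
```

Keeping the #ifdef in the header rather than at each call site is what
lets the common initialiser stay free of feature-specific code.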

* [PATCH 3/7] blkio-cgroup: Refactoring io-context initialization
       [not found]   ` <20090428.214457.226801820.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:45     ` Ryo Tsuruta
  0 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:45 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

This patch refactors io_context initialization.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 block/blk-ioc.c           |   30 +++++++++++++++++-------------
 include/linux/iocontext.h |    1 +
 2 files changed, 18 insertions(+), 13 deletions(-)

Index: linux-2.6.30-rc3-git3/block/blk-ioc.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/block/blk-ioc.c
+++ linux-2.6.30-rc3-git3/block/blk-ioc.c
@@ -84,24 +84,28 @@ void exit_io_context(void)
 	}
 }
 
+void init_io_context(struct io_context *ioc)
+{
+	atomic_set(&ioc->refcount, 1);
+	atomic_set(&ioc->nr_tasks, 1);
+	spin_lock_init(&ioc->lock);
+	ioc->ioprio_changed = 0;
+	ioc->ioprio = 0;
+	ioc->last_waited = jiffies; /* doesn't matter... */
+	ioc->nr_batch_requests = 0; /* because this is 0 */
+	ioc->aic = NULL;
+	INIT_RADIX_TREE(&ioc->radix_root, GFP_ATOMIC | __GFP_HIGH);
+	INIT_HLIST_HEAD(&ioc->cic_list);
+	ioc->ioc_data = NULL;
+}
+
 struct io_context *alloc_io_context(gfp_t gfp_flags, int node)
 {
 	struct io_context *ret;
 
 	ret = kmem_cache_alloc_node(iocontext_cachep, gfp_flags, node);
-	if (ret) {
-		atomic_set(&ret->refcount, 1);
-		atomic_set(&ret->nr_tasks, 1);
-		spin_lock_init(&ret->lock);
-		ret->ioprio_changed = 0;
-		ret->ioprio = 0;
-		ret->last_waited = jiffies; /* doesn't matter... */
-		ret->nr_batch_requests = 0; /* because this is 0 */
-		ret->aic = NULL;
-		INIT_RADIX_TREE(&ret->radix_root, GFP_ATOMIC | __GFP_HIGH);
-		INIT_HLIST_HEAD(&ret->cic_list);
-		ret->ioc_data = NULL;
-	}
+	if (ret)
+		init_io_context(ret);
 
 	return ret;
 }
Index: linux-2.6.30-rc3-git3/include/linux/iocontext.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/iocontext.h
+++ linux-2.6.30-rc3-git3/include/linux/iocontext.h
@@ -104,6 +104,7 @@ int put_io_context(struct io_context *io
 void exit_io_context(void);
 struct io_context *get_io_context(gfp_t gfp_flags, int node);
 struct io_context *alloc_io_context(gfp_t gfp_flags, int node);
+void init_io_context(struct io_context *ioc);
 void copy_io_context(struct io_context **pdst, struct io_context **psrc);
 #else
 static inline void exit_io_context(void)
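
The refactoring separates "allocate" from "initialise", so that an
object whose storage comes from somewhere else can be set up by the
same code. A minimal user-space sketch of that split follows. The
demo_ctx type and its fields are hypothetical and much smaller than
struct io_context, and malloc() stands in for the slab allocator.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_ctx {
	int refcount;
	int nr_tasks;
	void *data;
};

/* Initialise an object whose storage the caller already owns. */
static void demo_ctx_init(struct demo_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->refcount = 1;
	ctx->nr_tasks = 1;
	ctx->data = NULL;
}

/* Allocate, then delegate to the shared initialiser. */
static struct demo_ctx *demo_ctx_alloc(void)
{
	struct demo_ctx *ctx = malloc(sizeof(*ctx));

	if (ctx)
		demo_ctx_init(ctx);
	return ctx;
}
```

A caller can then run demo_ctx_init() on an object embedded in another
structure or on the stack, while demo_ctx_alloc() behaves as before.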

* [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup
       [not found]     ` <20090428.214535.71090234.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:46       ` Ryo Tsuruta
  0 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:46 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

The body of blkio-cgroup.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 include/linux/biotrack.h      |   97 +++++++++++++
 include/linux/cgroup_subsys.h |    6 
 include/linux/page_cgroup.h   |   23 +++
 init/Kconfig                  |   13 +
 mm/Makefile                   |    1 
 mm/biotrack.c                 |  300 ++++++++++++++++++++++++++++++++++++++++++
 mm/page_cgroup.c              |   14 +
 7 files changed, 447 insertions(+), 7 deletions(-)

Index: linux-2.6.30-rc3-git3/include/linux/biotrack.h
===================================================================
--- /dev/null
+++ linux-2.6.30-rc3-git3/include/linux/biotrack.h
@@ -0,0 +1,97 @@
+#include <linux/cgroup.h>
+#include <linux/mm.h>
+#include <linux/page_cgroup.h>
+
+#ifndef _LINUX_BIOTRACK_H
+#define _LINUX_BIOTRACK_H
+
+#ifdef	CONFIG_CGROUP_BLKIO
+
+struct io_context;
+struct block_device;
+
+struct blkio_cgroup {
+	struct cgroup_subsys_state css;
+	struct io_context *io_context;	/* default io_context */
+/*	struct radix_tree_root io_context_root; per device io_context */
+};
+
+/**
+ * __init_blkio_page_cgroup() - initialize a blkio_page_cgroup
+ * @pc:		page_cgroup of the page
+ *
+ * Reset the owner ID of a page.
+ */
+static inline void __init_blkio_page_cgroup(struct page_cgroup *pc)
+{
+	lock_page_cgroup(pc);
+	page_cgroup_set_id(pc, 0);
+	unlock_page_cgroup(pc);
+}
+
+/**
+ * blkio_cgroup_disabled - check whether blkio_cgroup is disabled
+ *
+ * Returns true if disabled, false if not.
+ */
+static inline bool blkio_cgroup_disabled(void)
+{
+	if (blkio_cgroup_subsys.disabled)
+		return true;
+	return false;
+}
+
+extern void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm);
+extern void blkio_cgroup_reset_owner(struct page *page, struct mm_struct *mm);
+extern void blkio_cgroup_reset_owner_pagedirty(struct page *page,
+						 struct mm_struct *mm);
+extern void blkio_cgroup_copy_owner(struct page *npage, struct page *opage);
+
+extern struct io_context *get_blkio_cgroup_iocontext(struct bio *bio);
+extern unsigned long get_blkio_cgroup_id(struct bio *bio);
+extern struct cgroup *blkio_cgroup_lookup(int id);
+
+#else	/* CONFIG_CGROUP_BLKIO */
+
+struct blkio_cgroup;
+
+static inline void __init_blkio_page_cgroup(struct page_cgroup *pc)
+{
+}
+
+static inline bool blkio_cgroup_disabled(void)
+{
+	return true;
+}
+
+static inline void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm)
+{
+}
+
+static inline void blkio_cgroup_reset_owner(struct page *page,
+						struct mm_struct *mm)
+{
+}
+
+static inline void blkio_cgroup_reset_owner_pagedirty(struct page *page,
+						struct mm_struct *mm)
+{
+}
+
+static inline void blkio_cgroup_copy_owner(struct page *page, struct page *opage)
+{
+}
+
+static inline struct io_context *get_blkio_cgroup_iocontext(struct bio *bio)
+{
+	return NULL;
+}
+
+static inline unsigned long get_blkio_cgroup_id(struct bio *bio)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CGROUP_BLKIO */
+
+#endif /* _LINUX_BIOTRACK_H */
Index: linux-2.6.30-rc3-git3/include/linux/cgroup_subsys.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/cgroup_subsys.h
+++ linux-2.6.30-rc3-git3/include/linux/cgroup_subsys.h
@@ -43,6 +43,12 @@ SUBSYS(mem_cgroup)
 
 /* */
 
+#ifdef CONFIG_CGROUP_BLKIO
+SUBSYS(blkio_cgroup)
+#endif
+
+/* */
+
 #ifdef CONFIG_CGROUP_DEVICE
 SUBSYS(devices)
 #endif
Index: linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/page_cgroup.h
+++ linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
@@ -124,4 +124,27 @@ static inline void swap_cgroup_swapoff(i
 }
 
 #endif
+
+#ifdef CONFIG_CGROUP_BLKIO
+/*
+ * use lower 16 bits for flags and reserve the rest for the page tracking id
+ */
+#define PCG_TRACKING_ID_SHIFT	(16)
+#define PCG_TRACKING_ID_BITS \
+	(8 * sizeof(unsigned long) - PCG_TRACKING_ID_SHIFT)
+
+/* NOTE: must be called with lock_page_cgroup() held */
+static inline unsigned long page_cgroup_get_id(struct page_cgroup *pc)
+{
+	return pc->flags >> PCG_TRACKING_ID_SHIFT;
+}
+
+/* NOTE: must be called with lock_page_cgroup() held */
+static inline void page_cgroup_set_id(struct page_cgroup *pc, unsigned long id)
+{
+	WARN_ON(id >= (1UL << PCG_TRACKING_ID_BITS));
+	pc->flags &= (1UL << PCG_TRACKING_ID_SHIFT) - 1;
+	pc->flags |= (unsigned long)(id << PCG_TRACKING_ID_SHIFT);
+}
+#endif
 #endif
Index: linux-2.6.30-rc3-git3/init/Kconfig
===================================================================
--- linux-2.6.30-rc3-git3.orig/init/Kconfig
+++ linux-2.6.30-rc3-git3/init/Kconfig
@@ -608,9 +608,20 @@ config CGROUP_MEM_RES_CTLR_SWAP
 
 endif # CGROUPS
 
+config CGROUP_BLKIO
+	bool "Block I/O cgroup subsystem"
+	depends on CGROUPS && BLOCK
+	select MM_OWNER
+	help
+	  Provides a resource controller that tracks the owner
+	  of every block I/O request.
+	  The information this subsystem provides can be used from any
+	  kind of module such as dm-ioband device mapper modules or
+	  the cfq-scheduler.
+
 config CGROUP_PAGE
 	def_bool y
-	depends on CGROUP_MEM_RES_CTLR
+	depends on CGROUP_MEM_RES_CTLR || CGROUP_BLKIO
 
 config MM_OWNER
 	bool
Index: linux-2.6.30-rc3-git3/mm/biotrack.c
===================================================================
--- /dev/null
+++ linux-2.6.30-rc3-git3/mm/biotrack.c
@@ -0,0 +1,300 @@
+/* biotrack.c - Block I/O Tracking
+ *
+ * Copyright (C) VA Linux Systems Japan, 2008-2009
+ * Developed by Hirokazu Takahashi <taka@valinux.co.jp>
+ *
+ * Copyright (C) 2008 Andrea Righi <righi.andrea@gmail.com>
+ * Use part of page_cgroup->flags to store blkio-cgroup ID.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/smp.h>
+#include <linux/bit_spinlock.h>
+#include <linux/blkdev.h>
+#include <linux/biotrack.h>
+#include <linux/mm_inline.h>
+
+/*
+ * The block I/O tracking mechanism is implemented on the cgroup memory
+ * controller framework. It helps to find the owner of an I/O request
+ * because every I/O request has a target page and the owner of the page
+ * can be easily determined on the framework.
+ */
+
+/* Return the blkio_cgroup that associates with a cgroup. */
+static inline struct blkio_cgroup *cgroup_blkio(struct cgroup *cgrp)
+{
+	return container_of(cgroup_subsys_state(cgrp, blkio_cgroup_subsys_id),
+					struct blkio_cgroup, css);
+}
+
+/* Return the blkio_cgroup that associates with a process. */
+static inline struct blkio_cgroup *blkio_cgroup_from_task(struct task_struct *p)
+{
+	return container_of(task_subsys_state(p, blkio_cgroup_subsys_id),
+					struct blkio_cgroup, css);
+}
+
+static struct io_context default_blkio_io_context;
+static struct blkio_cgroup default_blkio_cgroup = {
+	.io_context	= &default_blkio_io_context,
+};
+
+/**
+ * blkio_cgroup_set_owner() - set the owner ID of a page.
+ * @page:	the page we want to tag
+ * @mm:		the mm_struct of a page owner
+ *
+ * Make a given page have the blkio-cgroup ID of the owner of this page.
+ */
+void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm)
+{
+	struct blkio_cgroup *biog;
+	struct page_cgroup *pc;
+	unsigned long id;
+
+	if (blkio_cgroup_disabled())
+		return;
+	pc = lookup_page_cgroup(page);
+	if (unlikely(!pc))
+		return;
+
+	lock_page_cgroup(pc);
+	page_cgroup_set_id(pc, 0);	/* 0: default blkio_cgroup id */
+	unlock_page_cgroup(pc);
+	if (!mm)
+		return;
+
+	rcu_read_lock();
+	biog = blkio_cgroup_from_task(rcu_dereference(mm->owner));
+	if (unlikely(!biog)) {
+		rcu_read_unlock();
+		return;
+	}
+	/*
+	 * css_get(&biog->css) isn't called to increment the reference
+	 * count of this blkio_cgroup "biog" so the css_id might turn
+	 * invalid even if this page is still active.
+	 * This approach is chosen to minimize the overhead.
+	 */
+	id = css_id(&biog->css);
+	rcu_read_unlock();
+	lock_page_cgroup(pc);
+	page_cgroup_set_id(pc, id);
+	unlock_page_cgroup(pc);
+}
+
+/**
+ * blkio_cgroup_reset_owner() - reset the owner ID of a page
+ * @page:	the page we want to tag
+ * @mm:		the mm_struct of a page owner
+ *
+ * Change the owner of a given page if necessary.
+ */
+void blkio_cgroup_reset_owner(struct page *page, struct mm_struct *mm)
+{
+	blkio_cgroup_set_owner(page, mm);
+}
+
+/**
+ * blkio_cgroup_reset_owner_pagedirty() - reset the owner ID of a pagecache page
+ * @page:	the page we want to tag
+ * @mm:		the mm_struct of a page owner
+ *
+ * Change the owner of a given page if the page is in the pagecache.
+ */
+void blkio_cgroup_reset_owner_pagedirty(struct page *page, struct mm_struct *mm)
+{
+	if (!page_is_file_cache(page))
+		return;
+	if (current->flags & PF_MEMALLOC)
+		return;
+
+	blkio_cgroup_reset_owner(page, mm);
+}
+
+/**
+ * blkio_cgroup_copy_owner() - copy the owner ID of a page into another page
+ * @npage:	the page to which we want to copy the owner ID
+ * @opage:	the page from which we want to copy the ID
+ *
+ * Copy the owner ID of @opage into @npage.
+ */
+void blkio_cgroup_copy_owner(struct page *npage, struct page *opage)
+{
+	struct page_cgroup *npc, *opc;
+	unsigned long id;
+
+	if (blkio_cgroup_disabled())
+		return;
+	npc = lookup_page_cgroup(npage);
+	if (unlikely(!npc))
+		return;
+	opc = lookup_page_cgroup(opage);
+	if (unlikely(!opc))
+		return;
+
+	lock_page_cgroup(opc);
+	lock_page_cgroup(npc);
+	id = page_cgroup_get_id(opc);
+	page_cgroup_set_id(npc, id);
+	unlock_page_cgroup(npc);
+	unlock_page_cgroup(opc);
+}
+
+/* Create a new blkio-cgroup. */
+static struct cgroup_subsys_state *
+blkio_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
+{
+	struct blkio_cgroup *biog;
+	struct io_context *ioc;
+
+	if (!cgrp->parent) {
+		biog = &default_blkio_cgroup;
+		init_io_context(biog->io_context);
+		/* Take an extra reference so that it is never released. */
+		atomic_inc(&biog->io_context->refcount);
+		return &biog->css;
+	}
+
+	biog = kzalloc(sizeof(*biog), GFP_KERNEL);
+	if (!biog)
+		return ERR_PTR(-ENOMEM);
+	ioc = alloc_io_context(GFP_KERNEL, -1);
+	if (!ioc) {
+		kfree(biog);
+		return ERR_PTR(-ENOMEM);
+	}
+	biog->io_context = ioc;
+	return &biog->css;
+}
+
+/* Delete the blkio-cgroup. */
+static void blkio_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
+{
+	struct blkio_cgroup *biog = cgroup_blkio(cgrp);
+
+	put_io_context(biog->io_context);
+	free_css_id(&blkio_cgroup_subsys, &biog->css);
+	kfree(biog);
+}
+
+/**
+ * get_blkio_cgroup_id() - determine the blkio-cgroup ID
+ * @bio:	the &struct bio which describes the I/O
+ *
+ * Returns the blkio-cgroup ID of a given bio. A return value zero
+ * means that the page associated with the bio belongs to default_blkio_cgroup.
+ */
+unsigned long get_blkio_cgroup_id(struct bio *bio)
+{
+	struct page_cgroup *pc;
+	struct page *page = bio_iovec_idx(bio, 0)->bv_page;
+	unsigned long id = 0;
+
+	pc = lookup_page_cgroup(page);
+	if (pc) {
+		lock_page_cgroup(pc);
+		id = page_cgroup_get_id(pc);
+		unlock_page_cgroup(pc);
+	}
+	return id;
+}
+
+/**
+ * get_blkio_cgroup_iocontext() - determine the blkio-cgroup iocontext
+ * @bio:	the &struct bio which describes the I/O
+ *
+ * Returns the iocontext of blkio-cgroup that issued a given bio.
+ */
+struct io_context *get_blkio_cgroup_iocontext(struct bio *bio)
+{
+	struct cgroup_subsys_state *css;
+	struct blkio_cgroup *biog;
+	struct io_context *ioc;
+	unsigned long id;
+
+	id = get_blkio_cgroup_id(bio);
+	rcu_read_lock();
+	css = css_lookup(&blkio_cgroup_subsys, id);
+	if (css)
+		biog = container_of(css, struct blkio_cgroup, css);
+	else
+		biog = &default_blkio_cgroup;
+	ioc = biog->io_context;	/* default io_context for this cgroup */
+	atomic_inc(&ioc->refcount);
+	rcu_read_unlock();
+	return ioc;
+}
+
+/**
+ * blkio_cgroup_lookup() - lookup a cgroup by blkio-cgroup ID
+ * @id:		blkio-cgroup ID
+ *
+ * Returns the cgroup associated with the specified ID, or NULL if lookup
+ * fails.
+ *
+ * Note:
+ * This function should be called under rcu_read_lock().
+ */
+struct cgroup *blkio_cgroup_lookup(int id)
+{
+	struct cgroup *cgrp;
+	struct cgroup_subsys_state *css;
+
+	if (blkio_cgroup_disabled())
+		return NULL;
+
+	css = css_lookup(&blkio_cgroup_subsys, id);
+	if (!css)
+		return NULL;
+	cgrp = css->cgroup;
+	return cgrp;
+}
+EXPORT_SYMBOL(get_blkio_cgroup_iocontext);
+EXPORT_SYMBOL(get_blkio_cgroup_id);
+EXPORT_SYMBOL(blkio_cgroup_lookup);
+
+static u64 blkio_id_read(struct cgroup *cgrp, struct cftype *cft)
+{
+	struct blkio_cgroup *biog = cgroup_blkio(cgrp);
+	unsigned long id;
+
+	rcu_read_lock();
+	id = css_id(&biog->css);
+	rcu_read_unlock();
+	return (u64)id;
+}
+
+
+static struct cftype blkio_files[] = {
+	{
+		.name = "id",
+		.read_u64 = blkio_id_read,
+	},
+};
+
+static int blkio_cgroup_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
+{
+	return cgroup_add_files(cgrp, ss, blkio_files,
+					ARRAY_SIZE(blkio_files));
+}
+
+struct cgroup_subsys blkio_cgroup_subsys = {
+	.name		= "blkio",
+	.create		= blkio_cgroup_create,
+	.destroy	= blkio_cgroup_destroy,
+	.populate	= blkio_cgroup_populate,
+	.subsys_id	= blkio_cgroup_subsys_id,
+	.use_id		= 1,
+};
Index: linux-2.6.30-rc3-git3/mm/page_cgroup.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/page_cgroup.c
+++ linux-2.6.30-rc3-git3/mm/page_cgroup.c
@@ -9,6 +9,7 @@
 #include <linux/vmalloc.h>
 #include <linux/cgroup.h>
 #include <linux/swapops.h>
+#include <linux/biotrack.h>
 
 static void __meminit
 __init_page_cgroup(struct page_cgroup *pc, unsigned long pfn)
@@ -16,6 +17,7 @@ __init_page_cgroup(struct page_cgroup *p
 	pc->flags = 0;
 	pc->page = pfn_to_page(pfn);
 	__init_mem_page_cgroup(pc);
+	__init_blkio_page_cgroup(pc);
 }
 static unsigned long total_usage;
 
@@ -73,7 +75,7 @@ void __init page_cgroup_init(void)
 
 	int nid, fail;
 
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled() && blkio_cgroup_disabled())
 		return;
 
 	for_each_online_node(nid)  {
@@ -82,12 +84,12 @@ void __init page_cgroup_init(void)
 			goto fail;
 	}
 	printk(KERN_INFO "allocated %ld bytes of page_cgroup\n", total_usage);
-	printk(KERN_INFO "please try cgroup_disable=memory option if you"
+	printk(KERN_INFO "please try cgroup_disable=memory,blkio option if you"
 	" don't want\n");
 	return;
 fail:
 	printk(KERN_CRIT "allocation of page_cgroup was failed.\n");
-	printk(KERN_CRIT "please try cgroup_disable=memory boot option\n");
+	printk(KERN_CRIT "please try cgroup_disable=memory,blkio boot options\n");
 	panic("Out of memory");
 }
 
@@ -247,7 +249,7 @@ void __init page_cgroup_init(void)
 	unsigned long pfn;
 	int fail = 0;
 
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled() && blkio_cgroup_disabled())
 		return;
 
 	for (pfn = 0; !fail && pfn < max_pfn; pfn += PAGES_PER_SECTION) {
@@ -262,8 +264,8 @@ void __init page_cgroup_init(void)
 		hotplug_memory_notifier(page_cgroup_callback, 0);
 	}
 	printk(KERN_INFO "allocated %ld bytes of page_cgroup\n", total_usage);
-	printk(KERN_INFO "please try cgroup_disable=memory option if you don't"
-	" want\n");
+	printk(KERN_INFO "please try cgroup_disable=memory,blkio option"
+	" if you don't want\n");
 }
 
 void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
Index: linux-2.6.30-rc3-git3/mm/Makefile
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/Makefile
+++ linux-2.6.30-rc3-git3/mm/Makefile
@@ -39,3 +39,4 @@ endif
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o
 obj-$(CONFIG_CGROUP_PAGE) += page_cgroup.o
+obj-$(CONFIG_CGROUP_BLKIO) += biotrack.o
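
A note on the unreferenced ID: blkio_cgroup_set_owner() stores a css_id
in the page without taking a css reference, so the ID found on a page
may belong to a group that has since been destroyed. In that case
get_blkio_cgroup_iocontext() falls back to default_blkio_cgroup. The
sketch below is a small userspace model of that fallback only, under
assumed names: model_group, live_groups[], model_lookup() and
model_get_group() are hypothetical and do not exist in the kernel, and
RCU and locking are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 8	/* hypothetical table size for this model */

struct model_group {
	int ioc_refcount;	/* stands in for io_context->refcount */
};

/* The default group holds one permanent reference, as in the patch. */
static struct model_group default_group = { .ioc_refcount = 1 };

/* A NULL slot means "no live group currently has this ID". */
static struct model_group *live_groups[MAX_GROUPS];

/* Models css_lookup(): NULL for unknown or already destroyed IDs. */
static struct model_group *model_lookup(unsigned long id)
{
	if (id >= MAX_GROUPS)
		return NULL;
	return live_groups[id];
}

/* Models get_blkio_cgroup_iocontext(): stale IDs map to the default. */
static struct model_group *model_get_group(unsigned long page_id)
{
	struct model_group *g = model_lookup(page_id);

	if (!g)
		g = &default_group;
	g->ioc_refcount++;	/* the caller now owns one reference */
	return g;
}
```

In this model a page tagged with the ID of a destroyed group is simply
charged to the default group, which matches the trade-off described in
the comment inside blkio_cgroup_set_owner().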

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup
  2009-04-28 12:45   ` Ryo Tsuruta
  2009-04-28 12:46     ` [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup Ryo Tsuruta
@ 2009-04-28 12:46     ` Ryo Tsuruta
       [not found]     ` <20090428.214535.71090234.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
  2 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:46 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

The body of blkio-cgroup: the cgroup subsystem itself, the page-owner
tracking helpers, and the ID lookup interface.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 include/linux/biotrack.h      |   97 +++++++++++++
 include/linux/cgroup_subsys.h |    6 
 include/linux/page_cgroup.h   |   23 +++
 init/Kconfig                  |   13 +
 mm/Makefile                   |    1 
 mm/biotrack.c                 |  300 ++++++++++++++++++++++++++++++++++++++++++
 mm/page_cgroup.c              |   14 +
 7 files changed, 447 insertions(+), 7 deletions(-)
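
For readers who want to see the ID packing from include/linux/page_cgroup.h
in isolation, here is a minimal userspace sketch. It is not part of the
patch. struct pc_model and the model_* names are hypothetical stand-ins for
struct page_cgroup and the page_cgroup_get_id()/page_cgroup_set_id() helpers,
and assert() replaces the WARN_ON() used in the kernel code.

```c
#include <assert.h>

/* Same layout as the patch: low 16 bits are flags, the rest is the ID. */
#define TRACKING_ID_SHIFT 16
#define TRACKING_ID_BITS  (8 * sizeof(unsigned long) - TRACKING_ID_SHIFT)
#define FLAGS_MASK        ((1UL << TRACKING_ID_SHIFT) - 1)

/* Hypothetical stand-in for struct page_cgroup; only the flags word is modeled. */
struct pc_model {
	unsigned long flags;
};

/* The ID lives above the flag bits, so a plain shift recovers it. */
static unsigned long model_get_id(const struct pc_model *pc)
{
	return pc->flags >> TRACKING_ID_SHIFT;
}

static void model_set_id(struct pc_model *pc, unsigned long id)
{
	assert(id < (1UL << TRACKING_ID_BITS));	/* the patch uses WARN_ON() here */
	pc->flags &= FLAGS_MASK;		/* keep the flag bits, drop the old ID */
	pc->flags |= id << TRACKING_ID_SHIFT;	/* store the new ID above them */
}
```

Setting ID 42 on a word whose flag bits are 0x5 leaves the flag bits at 0x5
and makes model_get_id() return 42. Overwriting it with ID 7 leaves no bits
of the old ID behind, because the mask clears the whole ID field first.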

Index: linux-2.6.30-rc3-git3/include/linux/biotrack.h
===================================================================
--- /dev/null
+++ linux-2.6.30-rc3-git3/include/linux/biotrack.h
@@ -0,0 +1,97 @@
+#include <linux/cgroup.h>
+#include <linux/mm.h>
+#include <linux/page_cgroup.h>
+
+#ifndef _LINUX_BIOTRACK_H
+#define _LINUX_BIOTRACK_H
+
+#ifdef	CONFIG_CGROUP_BLKIO
+
+struct io_context;
+struct block_device;
+
+struct blkio_cgroup {
+	struct cgroup_subsys_state css;
+	struct io_context *io_context;	/* default io_context */
+/*	struct radix_tree_root io_context_root; per device io_context */
+};
+
+/**
+ * __init_blkio_page_cgroup() - initialize a blkio_page_cgroup
+ * @pc:		page_cgroup of the page
+ *
+ * Reset the owner ID of a page.
+ */
+static inline void __init_blkio_page_cgroup(struct page_cgroup *pc)
+{
+	lock_page_cgroup(pc);
+	page_cgroup_set_id(pc, 0);
+	unlock_page_cgroup(pc);
+}
+
+/**
+ * blkio_cgroup_disabled - check whether blkio_cgroup is disabled
+ *
+ * Returns true if disabled, false if not.
+ */
+static inline bool blkio_cgroup_disabled(void)
+{
+	if (blkio_cgroup_subsys.disabled)
+		return true;
+	return false;
+}
+
+extern void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm);
+extern void blkio_cgroup_reset_owner(struct page *page, struct mm_struct *mm);
+extern void blkio_cgroup_reset_owner_pagedirty(struct page *page,
+						 struct mm_struct *mm);
+extern void blkio_cgroup_copy_owner(struct page *page, struct page *opage);
+
+extern struct io_context *get_blkio_cgroup_iocontext(struct bio *bio);
+extern unsigned long get_blkio_cgroup_id(struct bio *bio);
+extern struct cgroup *blkio_cgroup_lookup(int id);
+
+#else	/* !CONFIG_CGROUP_BLKIO */
+
+struct blkio_cgroup;
+
+static inline void __init_blkio_page_cgroup(struct page_cgroup *pc)
+{
+}
+
+static inline bool blkio_cgroup_disabled(void)
+{
+	return true;
+}
+
+static inline void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm)
+{
+}
+
+static inline void blkio_cgroup_reset_owner(struct page *page,
+						struct mm_struct *mm)
+{
+}
+
+static inline void blkio_cgroup_reset_owner_pagedirty(struct page *page,
+						struct mm_struct *mm)
+{
+}
+
+static inline void blkio_cgroup_copy_owner(struct page *page, struct page *opage)
+{
+}
+
+static inline struct io_context *get_blkio_cgroup_iocontext(struct bio *bio)
+{
+	return NULL;
+}
+
+static inline unsigned long get_blkio_cgroup_id(struct bio *bio)
+{
+	return 0;
+}
+
+#endif	/* CONFIG_CGROUP_BLKIO */
+
+#endif /* _LINUX_BIOTRACK_H */
Index: linux-2.6.30-rc3-git3/include/linux/cgroup_subsys.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/cgroup_subsys.h
+++ linux-2.6.30-rc3-git3/include/linux/cgroup_subsys.h
@@ -43,6 +43,12 @@ SUBSYS(mem_cgroup)
 
 /* */
 
+#ifdef CONFIG_CGROUP_BLKIO
+SUBSYS(blkio_cgroup)
+#endif
+
+/* */
+
 #ifdef CONFIG_CGROUP_DEVICE
 SUBSYS(devices)
 #endif
Index: linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/page_cgroup.h
+++ linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
@@ -124,4 +124,27 @@ static inline void swap_cgroup_swapoff(i
 }
 
 #endif
+
+#ifdef CONFIG_CGROUP_BLKIO
+/*
+ * use lower 16 bits for flags and reserve the rest for the page tracking id
+ */
+#define PCG_TRACKING_ID_SHIFT	(16)
+#define PCG_TRACKING_ID_BITS \
+	(8 * sizeof(unsigned long) - PCG_TRACKING_ID_SHIFT)
+
+/* NOTE: must be called with lock_page_cgroup() held */
+static inline unsigned long page_cgroup_get_id(struct page_cgroup *pc)
+{
+	return pc->flags >> PCG_TRACKING_ID_SHIFT;
+}
+
+/* NOTE: must be called with lock_page_cgroup() held */
+static inline void page_cgroup_set_id(struct page_cgroup *pc, unsigned long id)
+{
+	WARN_ON(id >= (1UL << PCG_TRACKING_ID_BITS));
+	pc->flags &= (1UL << PCG_TRACKING_ID_SHIFT) - 1;
+	pc->flags |= (unsigned long)(id << PCG_TRACKING_ID_SHIFT);
+}
+#endif
 #endif
Index: linux-2.6.30-rc3-git3/init/Kconfig
===================================================================
--- linux-2.6.30-rc3-git3.orig/init/Kconfig
+++ linux-2.6.30-rc3-git3/init/Kconfig
@@ -608,9 +608,20 @@ config CGROUP_MEM_RES_CTLR_SWAP
 
 endif # CGROUPS
 
+config CGROUP_BLKIO
+	bool "Block I/O cgroup subsystem"
+	depends on CGROUPS && BLOCK
+	select MM_OWNER
+	help
+	  Provides a resource controller that tracks the owner of every
+	  block I/O request.
+	  The information this subsystem provides can be used from any
+	  kind of module such as dm-ioband device mapper modules or
+	  the cfq-scheduler.
+
 config CGROUP_PAGE
 	def_bool y
-	depends on CGROUP_MEM_RES_CTLR
+	depends on CGROUP_MEM_RES_CTLR || CGROUP_BLKIO
 
 config MM_OWNER
 	bool
Index: linux-2.6.30-rc3-git3/mm/biotrack.c
===================================================================
--- /dev/null
+++ linux-2.6.30-rc3-git3/mm/biotrack.c
@@ -0,0 +1,300 @@
+/* biotrack.c - Block I/O Tracking
+ *
+ * Copyright (C) VA Linux Systems Japan, 2008-2009
+ * Developed by Hirokazu Takahashi <taka@valinux.co.jp>
+ *
+ * Copyright (C) 2008 Andrea Righi <righi.andrea@gmail.com>
+ * Use part of page_cgroup->flags to store blkio-cgroup ID.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/smp.h>
+#include <linux/bit_spinlock.h>
+#include <linux/blkdev.h>
+#include <linux/biotrack.h>
+#include <linux/mm_inline.h>
+
+/*
+ * The block I/O tracking mechanism is implemented on the cgroup memory
+ * controller framework. It helps to find the owner of an I/O request
+ * because every I/O request has a target page and the owner of the page
+ * can be easily determined on the framework.
+ */
+
+/* Return the blkio_cgroup associated with a cgroup. */
+static inline struct blkio_cgroup *cgroup_blkio(struct cgroup *cgrp)
+{
+	return container_of(cgroup_subsys_state(cgrp, blkio_cgroup_subsys_id),
+					struct blkio_cgroup, css);
+}
+
+/* Return the blkio_cgroup associated with a process. */
+static inline struct blkio_cgroup *blkio_cgroup_from_task(struct task_struct *p)
+{
+	return container_of(task_subsys_state(p, blkio_cgroup_subsys_id),
+					struct blkio_cgroup, css);
+}
+
+static struct io_context default_blkio_io_context;
+static struct blkio_cgroup default_blkio_cgroup = {
+	.io_context	= &default_blkio_io_context,
+};
+
+/**
+ * blkio_cgroup_set_owner() - set the owner ID of a page.
+ * @page:	the page we want to tag
+ * @mm:		the mm_struct of a page owner
+ *
+ * Make a given page have the blkio-cgroup ID of the owner of this page.
+ */
+void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm)
+{
+	struct blkio_cgroup *biog;
+	struct page_cgroup *pc;
+	unsigned long id;
+
+	if (blkio_cgroup_disabled())
+		return;
+	pc = lookup_page_cgroup(page);
+	if (unlikely(!pc))
+		return;
+
+	lock_page_cgroup(pc);
+	page_cgroup_set_id(pc, 0);	/* 0: default blkio_cgroup id */
+	unlock_page_cgroup(pc);
+	if (!mm)
+		return;
+
+	rcu_read_lock();
+	biog = blkio_cgroup_from_task(rcu_dereference(mm->owner));
+	if (unlikely(!biog)) {
+		rcu_read_unlock();
+		return;
+	}
+	/*
+	 * css_get(&biog->css) isn't called to increment the reference
+	 * count of this blkio_cgroup "biog" so the css_id might turn
+	 * invalid even if this page is still active.
+	 * This approach is chosen to minimize the overhead.
+	 */
+	id = css_id(&biog->css);
+	rcu_read_unlock();
+	lock_page_cgroup(pc);
+	page_cgroup_set_id(pc, id);
+	unlock_page_cgroup(pc);
+}
+
+/**
+ * blkio_cgroup_reset_owner() - reset the owner ID of a page
+ * @page:	the page we want to tag
+ * @mm:		the mm_struct of a page owner
+ *
+ * Change the owner of a given page if necessary.
+ */
+void blkio_cgroup_reset_owner(struct page *page, struct mm_struct *mm)
+{
+	blkio_cgroup_set_owner(page, mm);
+}
+
+/**
+ * blkio_cgroup_reset_owner_pagedirty() - reset the owner ID of a pagecache page
+ * @page:	the page we want to tag
+ * @mm:		the mm_struct of a page owner
+ *
+ * Change the owner of a given page if the page is in the pagecache.
+ */
+void blkio_cgroup_reset_owner_pagedirty(struct page *page, struct mm_struct *mm)
+{
+	if (!page_is_file_cache(page))
+		return;
+	if (current->flags & PF_MEMALLOC)
+		return;
+
+	blkio_cgroup_reset_owner(page, mm);
+}
+
+/**
+ * blkio_cgroup_copy_owner() - copy the owner ID of a page into another page
+ * @npage:	the page where we want to copy the owner
+ * @opage:	the page from which we want to copy the ID
+ *
+ * Copy the owner ID of @opage into @npage.
+ */
+void blkio_cgroup_copy_owner(struct page *npage, struct page *opage)
+{
+	struct page_cgroup *npc, *opc;
+	unsigned long id;
+
+	if (blkio_cgroup_disabled())
+		return;
+	npc = lookup_page_cgroup(npage);
+	if (unlikely(!npc))
+		return;
+	opc = lookup_page_cgroup(opage);
+	if (unlikely(!opc))
+		return;
+
+	lock_page_cgroup(opc);
+	lock_page_cgroup(npc);
+	id = page_cgroup_get_id(opc);
+	page_cgroup_set_id(npc, id);
+	unlock_page_cgroup(npc);
+	unlock_page_cgroup(opc);
+}
+
+/* Create a new blkio-cgroup. */
+static struct cgroup_subsys_state *
+blkio_cgroup_create(struct cgroup_subsys *ss, struct cgroup *cgrp)
+{
+	struct blkio_cgroup *biog;
+	struct io_context *ioc;
+
+	if (!cgrp->parent) {
+		biog = &default_blkio_cgroup;
+		init_io_context(biog->io_context);
+		/* Take an extra reference so that it is never released. */
+		atomic_inc(&biog->io_context->refcount);
+		return &biog->css;
+	}
+
+	biog = kzalloc(sizeof(*biog), GFP_KERNEL);
+	if (!biog)
+		return ERR_PTR(-ENOMEM);
+	ioc = alloc_io_context(GFP_KERNEL, -1);
+	if (!ioc) {
+		kfree(biog);
+		return ERR_PTR(-ENOMEM);
+	}
+	biog->io_context = ioc;
+	return &biog->css;
+}
+
+/* Delete the blkio-cgroup. */
+static void blkio_cgroup_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
+{
+	struct blkio_cgroup *biog = cgroup_blkio(cgrp);
+
+	put_io_context(biog->io_context);
+	free_css_id(&blkio_cgroup_subsys, &biog->css);
+	kfree(biog);
+}
+
+/**
+ * get_blkio_cgroup_id() - determine the blkio-cgroup ID
+ * @bio:	the &struct bio which describes the I/O
+ *
+ * Returns the blkio-cgroup ID of a given bio. A return value of zero
+ * means that the page associated with the bio belongs to default_blkio_cgroup.
+ */
+unsigned long get_blkio_cgroup_id(struct bio *bio)
+{
+	struct page_cgroup *pc;
+	struct page *page = bio_iovec_idx(bio, 0)->bv_page;
+	unsigned long id = 0;
+
+	pc = lookup_page_cgroup(page);
+	if (pc) {
+		lock_page_cgroup(pc);
+		id = page_cgroup_get_id(pc);
+		unlock_page_cgroup(pc);
+	}
+	return id;
+}
+
+/**
+ * get_blkio_cgroup_iocontext() - determine the blkio-cgroup iocontext
+ * @bio:	the &struct bio which describes the I/O
+ *
+ * Returns the iocontext of blkio-cgroup that issued a given bio.
+ */
+struct io_context *get_blkio_cgroup_iocontext(struct bio *bio)
+{
+	struct cgroup_subsys_state *css;
+	struct blkio_cgroup *biog;
+	struct io_context *ioc;
+	unsigned long id;
+
+	id = get_blkio_cgroup_id(bio);
+	rcu_read_lock();
+	css = css_lookup(&blkio_cgroup_subsys, id);
+	if (css)
+		biog = container_of(css, struct blkio_cgroup, css);
+	else
+		biog = &default_blkio_cgroup;
+	ioc = biog->io_context;	/* default io_context for this cgroup */
+	atomic_inc(&ioc->refcount);
+	rcu_read_unlock();
+	return ioc;
+}
+
+/**
+ * blkio_cgroup_lookup() - lookup a cgroup by blkio-cgroup ID
+ * @id:		blkio-cgroup ID
+ *
+ * Returns the cgroup associated with the specified ID, or NULL if lookup
+ * fails.
+ *
+ * Note:
+ * This function should be called under rcu_read_lock().
+ */
+struct cgroup *blkio_cgroup_lookup(int id)
+{
+	struct cgroup *cgrp;
+	struct cgroup_subsys_state *css;
+
+	if (blkio_cgroup_disabled())
+		return NULL;
+
+	css = css_lookup(&blkio_cgroup_subsys, id);
+	if (!css)
+		return NULL;
+	cgrp = css->cgroup;
+	return cgrp;
+}
+EXPORT_SYMBOL(get_blkio_cgroup_iocontext);
+EXPORT_SYMBOL(get_blkio_cgroup_id);
+EXPORT_SYMBOL(blkio_cgroup_lookup);
+
+static u64 blkio_id_read(struct cgroup *cgrp, struct cftype *cft)
+{
+	struct blkio_cgroup *biog = cgroup_blkio(cgrp);
+	unsigned long id;
+
+	rcu_read_lock();
+	id = css_id(&biog->css);
+	rcu_read_unlock();
+	return (u64)id;
+}
+
+
+static struct cftype blkio_files[] = {
+	{
+		.name = "id",
+		.read_u64 = blkio_id_read,
+	},
+};
+
+static int blkio_cgroup_populate(struct cgroup_subsys *ss, struct cgroup *cgrp)
+{
+	return cgroup_add_files(cgrp, ss, blkio_files,
+					ARRAY_SIZE(blkio_files));
+}
+
+struct cgroup_subsys blkio_cgroup_subsys = {
+	.name		= "blkio",
+	.create		= blkio_cgroup_create,
+	.destroy	= blkio_cgroup_destroy,
+	.populate	= blkio_cgroup_populate,
+	.subsys_id	= blkio_cgroup_subsys_id,
+	.use_id		= 1,
+};
Index: linux-2.6.30-rc3-git3/mm/page_cgroup.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/page_cgroup.c
+++ linux-2.6.30-rc3-git3/mm/page_cgroup.c
@@ -9,6 +9,7 @@
 #include <linux/vmalloc.h>
 #include <linux/cgroup.h>
 #include <linux/swapops.h>
+#include <linux/biotrack.h>
 
 static void __meminit
 __init_page_cgroup(struct page_cgroup *pc, unsigned long pfn)
@@ -16,6 +17,7 @@ __init_page_cgroup(struct page_cgroup *p
 	pc->flags = 0;
 	pc->page = pfn_to_page(pfn);
 	__init_mem_page_cgroup(pc);
+	__init_blkio_page_cgroup(pc);
 }
 static unsigned long total_usage;
 
@@ -73,7 +75,7 @@ void __init page_cgroup_init(void)
 
 	int nid, fail;
 
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled() && blkio_cgroup_disabled())
 		return;
 
 	for_each_online_node(nid)  {
@@ -82,12 +84,12 @@ void __init page_cgroup_init(void)
 			goto fail;
 	}
 	printk(KERN_INFO "allocated %ld bytes of page_cgroup\n", total_usage);
-	printk(KERN_INFO "please try cgroup_disable=memory option if you"
+	printk(KERN_INFO "please try cgroup_disable=memory,blkio option if you"
 	" don't want\n");
 	return;
 fail:
 	printk(KERN_CRIT "allocation of page_cgroup was failed.\n");
-	printk(KERN_CRIT "please try cgroup_disable=memory boot option\n");
+	printk(KERN_CRIT "please try cgroup_disable=memory,blkio boot options\n");
 	panic("Out of memory");
 }
 
@@ -247,7 +249,7 @@ void __init page_cgroup_init(void)
 	unsigned long pfn;
 	int fail = 0;
 
-	if (mem_cgroup_disabled())
+	if (mem_cgroup_disabled() && blkio_cgroup_disabled())
 		return;
 
 	for (pfn = 0; !fail && pfn < max_pfn; pfn += PAGES_PER_SECTION) {
@@ -262,8 +264,8 @@ void __init page_cgroup_init(void)
 		hotplug_memory_notifier(page_cgroup_callback, 0);
 	}
 	printk(KERN_INFO "allocated %ld bytes of page_cgroup\n", total_usage);
-	printk(KERN_INFO "please try cgroup_disable=memory option if you don't"
-	" want\n");
+	printk(KERN_INFO "please try cgroup_disable=memory,blkio option"
+	" if you don't want\n");
 }
 
 void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
Index: linux-2.6.30-rc3-git3/mm/Makefile
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/Makefile
+++ linux-2.6.30-rc3-git3/mm/Makefile
@@ -39,3 +39,4 @@ endif
 obj-$(CONFIG_QUICKLIST) += quicklist.o
 obj-$(CONFIG_CGROUP_MEM_RES_CTLR) += memcontrol.o
 obj-$(CONFIG_CGROUP_PAGE) += page_cgroup.o
+obj-$(CONFIG_CGROUP_BLKIO) += biotrack.o
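
The fallback in get_blkio_cgroup_iocontext() above matters because
blkio_cgroup_set_owner() does not pin the cgroup, so an ID stored in a page
may outlive its group. A minimal userspace sketch of that behaviour follows.
It is not part of the patch: the table, MAX_GROUPS and all model_* names are
hypothetical, model_lookup() only stands in for css_lookup(), and no RCU or
locking is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 8	/* hypothetical table size for this model */

struct ioc_model {
	int refcount;
};

struct group_model {
	struct ioc_model ioc;
	int live;		/* nonzero while the group exists */
};

/* The default group holds one permanent reference, as in blkio_cgroup_create(). */
static struct group_model default_group = { .ioc = { .refcount = 1 }, .live = 1 };
static struct group_model groups[MAX_GROUPS];	/* slot 0 unused: ID 0 means "default" */

/* Create a group in slot @id, holding one reference on its io_context. */
static void model_create_group(unsigned long id)
{
	assert(id > 0 && id < MAX_GROUPS);
	groups[id].ioc.refcount = 1;
	groups[id].live = 1;
}

/* Stand-in for css_lookup(): in this model, ID 0, out-of-range IDs and
 * removed groups all resolve to NULL. */
static struct group_model *model_lookup(unsigned long id)
{
	if (id == 0 || id >= MAX_GROUPS || !groups[id].live)
		return NULL;
	return &groups[id];
}

/* Mirrors get_blkio_cgroup_iocontext(): resolve, fall back, take a reference. */
static struct ioc_model *model_get_iocontext(unsigned long page_id)
{
	struct group_model *g = model_lookup(page_id);

	if (!g)
		g = &default_group;	/* stale or default ID */
	g->ioc.refcount++;		/* the caller must drop this reference later */
	return &g->ioc;
}
```

In this model, a page tagged with ID 3 resolves to group 3 while that group
exists and to the default group after it is removed, so the caller always
gets a usable io_context with a reference held.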

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 5/7] blkio-cgroup: Page tracking hooks
       [not found]       ` <20090428.214604.104034341.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:46         ` Ryo Tsuruta
  2009-05-04 12:05         ` [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup Andrea Righi
  1 sibling, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:46 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

This patch contains several hooks that let the blkio-cgroup framework know
which blkio-cgroup owns a page before I/O is started against that page.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 fs/buffer.c         |    2 ++
 fs/direct-io.c      |    2 ++
 mm/bounce.c         |    2 ++
 mm/filemap.c        |    2 ++
 mm/memory.c         |    5 +++++
 mm/page-writeback.c |    2 ++
 mm/swap_state.c     |    2 ++
 7 files changed, 17 insertions(+)
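
To make the effect of the hooks below easier to follow, here is a minimal
userspace sketch of the three ownership operations they call. It is not part
of the patch. struct page_model, struct task_model and the model_* functions
are hypothetical, the sketch folds "current" and its mm into one structure,
and the page_cgroup locking and RCU from mm/biotrack.c are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_model {
	unsigned long owner_id;	/* 0 means the default blkio-cgroup */
	bool file_cache;	/* models page_is_file_cache() */
};

struct task_model {
	unsigned long cgroup_id;	/* blkio-cgroup ID of the task's group */
	bool memalloc;			/* models PF_MEMALLOC in current->flags */
};

/* Like blkio_cgroup_set_owner(): reset to the default ID first, then tag
 * the page if there is an owner. A NULL owner models mm == NULL. */
static void model_set_owner(struct page_model *page, const struct task_model *owner)
{
	page->owner_id = 0;
	if (owner)
		page->owner_id = owner->cgroup_id;
}

/* Like blkio_cgroup_reset_owner_pagedirty(): retag only page cache pages,
 * and skip the update when the dirtying task has PF_MEMALLOC set. */
static void model_reset_owner_pagedirty(struct page_model *page,
					const struct task_model *cur)
{
	assert(cur != NULL);
	if (!page->file_cache)
		return;
	if (cur->memalloc)
		return;
	model_set_owner(page, cur);
}

/* Like blkio_cgroup_copy_owner(): a bounce page inherits the ID of the
 * page it was copied from. */
static void model_copy_owner(struct page_model *npage, const struct page_model *opage)
{
	npage->owner_id = opage->owner_id;
}
```

Under these assumptions, a page cache page is charged to whichever task
dirtied it last, an anonymous page keeps the owner assigned at fault time,
and a bounce page carries the same ID as the page it shadows.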

Index: linux-2.6.30-rc3-git3/fs/buffer.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/fs/buffer.c
+++ linux-2.6.30-rc3-git3/fs/buffer.c
@@ -36,6 +36,7 @@
 #include <linux/buffer_head.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/bio.h>
+#include <linux/biotrack.h>
 #include <linux/notifier.h>
 #include <linux/cpu.h>
 #include <linux/bitops.h>
@@ -668,6 +669,7 @@ static void __set_page_dirty(struct page
 	if (page->mapping) {	/* Race with truncate? */
 		WARN_ON_ONCE(warn && !PageUptodate(page));
 		account_page_dirtied(page, mapping);
+		blkio_cgroup_reset_owner_pagedirty(page, current->mm);
 		radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);
 	}
Index: linux-2.6.30-rc3-git3/fs/direct-io.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/fs/direct-io.c
+++ linux-2.6.30-rc3-git3/fs/direct-io.c
@@ -33,6 +33,7 @@
 #include <linux/err.h>
 #include <linux/blkdev.h>
 #include <linux/buffer_head.h>
+#include <linux/biotrack.h>
 #include <linux/rwsem.h>
 #include <linux/uio.h>
 #include <asm/atomic.h>
@@ -797,6 +798,7 @@ static int do_direct_IO(struct dio *dio)
 			ret = PTR_ERR(page);
 			goto out;
 		}
+		blkio_cgroup_reset_owner(page, current->mm);
 
 		while (block_in_page < blocks_per_page) {
 			unsigned offset_in_page = block_in_page << blkbits;
Index: linux-2.6.30-rc3-git3/mm/bounce.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/bounce.c
+++ linux-2.6.30-rc3-git3/mm/bounce.c
@@ -14,6 +14,7 @@
 #include <linux/hash.h>
 #include <linux/highmem.h>
 #include <linux/blktrace_api.h>
+#include <linux/biotrack.h>
 #include <trace/block.h>
 #include <asm/tlbflush.h>
 
@@ -212,6 +213,7 @@ static void __blk_queue_bounce(struct re
 		to->bv_len = from->bv_len;
 		to->bv_offset = from->bv_offset;
 		inc_zone_page_state(to->bv_page, NR_BOUNCE);
+		blkio_cgroup_copy_owner(to->bv_page, page);
 
 		if (rw == WRITE) {
 			char *vto, *vfrom;
Index: linux-2.6.30-rc3-git3/mm/filemap.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/filemap.c
+++ linux-2.6.30-rc3-git3/mm/filemap.c
@@ -33,6 +33,7 @@
 #include <linux/cpuset.h>
 #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
 #include <linux/memcontrol.h>
+#include <linux/biotrack.h>
 #include <linux/mm_inline.h> /* for page_is_file_cache() */
 #include "internal.h"
 
@@ -464,6 +465,7 @@ int add_to_page_cache_locked(struct page
 					gfp_mask & GFP_RECLAIM_MASK);
 	if (error)
 		goto out;
+	blkio_cgroup_set_owner(page, current->mm);
 
 	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
 	if (error == 0) {
Index: linux-2.6.30-rc3-git3/mm/memory.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/memory.c
+++ linux-2.6.30-rc3-git3/mm/memory.c
@@ -51,6 +51,7 @@
 #include <linux/init.h>
 #include <linux/writeback.h>
 #include <linux/memcontrol.h>
+#include <linux/biotrack.h>
 #include <linux/mmu_notifier.h>
 #include <linux/kallsyms.h>
 #include <linux/swapops.h>
@@ -2053,6 +2054,7 @@ gotten:
 		 */
 		ptep_clear_flush_notify(vma, address, page_table);
 		page_add_new_anon_rmap(new_page, vma, address);
+		blkio_cgroup_set_owner(new_page, mm);
 		set_pte_at(mm, address, page_table, entry);
 		update_mmu_cache(vma, address, entry);
 		if (old_page) {
@@ -2497,6 +2499,7 @@ static int do_swap_page(struct mm_struct
 	flush_icache_page(vma, page);
 	set_pte_at(mm, address, page_table, pte);
 	page_add_anon_rmap(page, vma, address);
+	blkio_cgroup_reset_owner(page, mm);
 	/* It's better to call commit-charge after rmap is established */
 	mem_cgroup_commit_charge_swapin(page, ptr);
 
@@ -2560,6 +2563,7 @@ static int do_anonymous_page(struct mm_s
 		goto release;
 	inc_mm_counter(mm, anon_rss);
 	page_add_new_anon_rmap(page, vma, address);
+	blkio_cgroup_set_owner(page, mm);
 	set_pte_at(mm, address, page_table, entry);
 
 	/* No need to invalidate - it was non-present before */
@@ -2712,6 +2716,7 @@ static int __do_fault(struct mm_struct *
 		if (anon) {
 			inc_mm_counter(mm, anon_rss);
 			page_add_new_anon_rmap(page, vma, address);
+			blkio_cgroup_set_owner(page, mm);
 		} else {
 			inc_mm_counter(mm, file_rss);
 			page_add_file_rmap(page);
Index: linux-2.6.30-rc3-git3/mm/page-writeback.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/page-writeback.c
+++ linux-2.6.30-rc3-git3/mm/page-writeback.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/backing-dev.h>
 #include <linux/task_io_accounting_ops.h>
+#include <linux/biotrack.h>
 #include <linux/blkdev.h>
 #include <linux/mpage.h>
 #include <linux/rmap.h>
@@ -1243,6 +1244,7 @@ int __set_page_dirty_nobuffers(struct pa
 			BUG_ON(mapping2 != mapping);
 			WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
 			account_page_dirtied(page, mapping);
+			blkio_cgroup_reset_owner_pagedirty(page, current->mm);
 			radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);
 		}
Index: linux-2.6.30-rc3-git3/mm/swap_state.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/swap_state.c
+++ linux-2.6.30-rc3-git3/mm/swap_state.c
@@ -18,6 +18,7 @@
 #include <linux/pagevec.h>
 #include <linux/migrate.h>
 #include <linux/page_cgroup.h>
+#include <linux/biotrack.h>
 
 #include <asm/pgtable.h>
 
@@ -308,6 +309,7 @@ struct page *read_swap_cache_async(swp_e
 		 */
 		__set_page_locked(new_page);
 		SetPageSwapBacked(new_page);
+		blkio_cgroup_set_owner(new_page, current->mm);
 		err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
 		if (likely(!err)) {
 			/*

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 5/7] blkio-cgroup: Page tracking hooks
  2009-04-28 12:46     ` [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup Ryo Tsuruta
  2009-04-28 12:46       ` [PATCH 5/7] blkio-cgroup: Page tracking hooks Ryo Tsuruta
       [not found]       ` <20090428.214604.104034341.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:46       ` Ryo Tsuruta
  2009-05-04 12:05       ` [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup Andrea Righi
  2009-05-04 12:05         ` Andrea Righi
  4 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:46 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

This patch contains several hooks that let the blkio-cgroup framework know
which blkio-cgroup owns a page before I/O is started against that page.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 fs/buffer.c         |    2 ++
 fs/direct-io.c      |    2 ++
 mm/bounce.c         |    2 ++
 mm/filemap.c        |    2 ++
 mm/memory.c         |    5 +++++
 mm/page-writeback.c |    2 ++
 mm/swap_state.c     |    2 ++
 7 files changed, 17 insertions(+)

Index: linux-2.6.30-rc3-git3/fs/buffer.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/fs/buffer.c
+++ linux-2.6.30-rc3-git3/fs/buffer.c
@@ -36,6 +36,7 @@
 #include <linux/buffer_head.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/bio.h>
+#include <linux/biotrack.h>
 #include <linux/notifier.h>
 #include <linux/cpu.h>
 #include <linux/bitops.h>
@@ -668,6 +669,7 @@ static void __set_page_dirty(struct page
 	if (page->mapping) {	/* Race with truncate? */
 		WARN_ON_ONCE(warn && !PageUptodate(page));
 		account_page_dirtied(page, mapping);
+		blkio_cgroup_reset_owner_pagedirty(page, current->mm);
 		radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);
 	}
Index: linux-2.6.30-rc3-git3/fs/direct-io.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/fs/direct-io.c
+++ linux-2.6.30-rc3-git3/fs/direct-io.c
@@ -33,6 +33,7 @@
 #include <linux/err.h>
 #include <linux/blkdev.h>
 #include <linux/buffer_head.h>
+#include <linux/biotrack.h>
 #include <linux/rwsem.h>
 #include <linux/uio.h>
 #include <asm/atomic.h>
@@ -797,6 +798,7 @@ static int do_direct_IO(struct dio *dio)
 			ret = PTR_ERR(page);
 			goto out;
 		}
+		blkio_cgroup_reset_owner(page, current->mm);
 
 		while (block_in_page < blocks_per_page) {
 			unsigned offset_in_page = block_in_page << blkbits;
Index: linux-2.6.30-rc3-git3/mm/bounce.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/bounce.c
+++ linux-2.6.30-rc3-git3/mm/bounce.c
@@ -14,6 +14,7 @@
 #include <linux/hash.h>
 #include <linux/highmem.h>
 #include <linux/blktrace_api.h>
+#include <linux/biotrack.h>
 #include <trace/block.h>
 #include <asm/tlbflush.h>
 
@@ -212,6 +213,7 @@ static void __blk_queue_bounce(struct re
 		to->bv_len = from->bv_len;
 		to->bv_offset = from->bv_offset;
 		inc_zone_page_state(to->bv_page, NR_BOUNCE);
+		blkio_cgroup_copy_owner(to->bv_page, page);
 
 		if (rw == WRITE) {
 			char *vto, *vfrom;
Index: linux-2.6.30-rc3-git3/mm/filemap.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/filemap.c
+++ linux-2.6.30-rc3-git3/mm/filemap.c
@@ -33,6 +33,7 @@
 #include <linux/cpuset.h>
 #include <linux/hardirq.h> /* for BUG_ON(!in_atomic()) only */
 #include <linux/memcontrol.h>
+#include <linux/biotrack.h>
 #include <linux/mm_inline.h> /* for page_is_file_cache() */
 #include "internal.h"
 
@@ -464,6 +465,7 @@ int add_to_page_cache_locked(struct page
 					gfp_mask & GFP_RECLAIM_MASK);
 	if (error)
 		goto out;
+	blkio_cgroup_set_owner(page, current->mm);
 
 	error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);
 	if (error == 0) {
Index: linux-2.6.30-rc3-git3/mm/memory.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/memory.c
+++ linux-2.6.30-rc3-git3/mm/memory.c
@@ -51,6 +51,7 @@
 #include <linux/init.h>
 #include <linux/writeback.h>
 #include <linux/memcontrol.h>
+#include <linux/biotrack.h>
 #include <linux/mmu_notifier.h>
 #include <linux/kallsyms.h>
 #include <linux/swapops.h>
@@ -2053,6 +2054,7 @@ gotten:
 		 */
 		ptep_clear_flush_notify(vma, address, page_table);
 		page_add_new_anon_rmap(new_page, vma, address);
+		blkio_cgroup_set_owner(new_page, mm);
 		set_pte_at(mm, address, page_table, entry);
 		update_mmu_cache(vma, address, entry);
 		if (old_page) {
@@ -2497,6 +2499,7 @@ static int do_swap_page(struct mm_struct
 	flush_icache_page(vma, page);
 	set_pte_at(mm, address, page_table, pte);
 	page_add_anon_rmap(page, vma, address);
+	blkio_cgroup_reset_owner(page, mm);
 	/* It's better to call commit-charge after rmap is established */
 	mem_cgroup_commit_charge_swapin(page, ptr);
 
@@ -2560,6 +2563,7 @@ static int do_anonymous_page(struct mm_s
 		goto release;
 	inc_mm_counter(mm, anon_rss);
 	page_add_new_anon_rmap(page, vma, address);
+	blkio_cgroup_set_owner(page, mm);
 	set_pte_at(mm, address, page_table, entry);
 
 	/* No need to invalidate - it was non-present before */
@@ -2712,6 +2716,7 @@ static int __do_fault(struct mm_struct *
 		if (anon) {
 			inc_mm_counter(mm, anon_rss);
 			page_add_new_anon_rmap(page, vma, address);
+			blkio_cgroup_set_owner(page, mm);
 		} else {
 			inc_mm_counter(mm, file_rss);
 			page_add_file_rmap(page);
Index: linux-2.6.30-rc3-git3/mm/page-writeback.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/page-writeback.c
+++ linux-2.6.30-rc3-git3/mm/page-writeback.c
@@ -23,6 +23,7 @@
 #include <linux/init.h>
 #include <linux/backing-dev.h>
 #include <linux/task_io_accounting_ops.h>
+#include <linux/biotrack.h>
 #include <linux/blkdev.h>
 #include <linux/mpage.h>
 #include <linux/rmap.h>
@@ -1243,6 +1244,7 @@ int __set_page_dirty_nobuffers(struct pa
 			BUG_ON(mapping2 != mapping);
 			WARN_ON_ONCE(!PagePrivate(page) && !PageUptodate(page));
 			account_page_dirtied(page, mapping);
+			blkio_cgroup_reset_owner_pagedirty(page, current->mm);
 			radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);
 		}
Index: linux-2.6.30-rc3-git3/mm/swap_state.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/swap_state.c
+++ linux-2.6.30-rc3-git3/mm/swap_state.c
@@ -18,6 +18,7 @@
 #include <linux/pagevec.h>
 #include <linux/migrate.h>
 #include <linux/page_cgroup.h>
+#include <linux/biotrack.h>
 
 #include <asm/pgtable.h>
 
@@ -308,6 +309,7 @@ struct page *read_swap_cache_async(swp_e
 		 */
 		__set_page_locked(new_page);
 		SetPageSwapBacked(new_page);
+		blkio_cgroup_set_owner(new_page, current->mm);
 		err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
 		if (likely(!err)) {
 			/*

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 6/7] blkio-cgroup: Add a cgroup support to dm-ioband
       [not found]         ` <20090428.214632.189725895.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:47           ` Ryo Tsuruta
  0 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:47 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

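With this patch, dm-ioband can work with blkio-cgroup: ioband_cgroup() now
returns the blkio-cgroup ID obtained from get_blkio_cgroup_id() instead of
the placeholder value 0.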
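
The comment removed by this patch explains why the ID of the submitting task
is not a suitable key: some BIOs are issued by kernel threads such as aio or
pdflush on behalf of the process that requested the I/O. A minimal userspace
sketch of the difference follows. The toy_* names are hypothetical and only
mimic the idea of reading the owner from the first page of a bio; they are
not the dm-ioband or blkio-cgroup API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not kernel types. */
struct toy_task { int cgroup_id; };
struct toy_page { int owner_id; };	/* recorded earlier by the tracking hooks */
struct toy_bio  { struct toy_page *first_page; };

/* Naive choice: charge the I/O to whichever task submits it. */
int toy_id_by_submitter(const struct toy_task *submitter)
{
	assert(submitter != NULL);
	return submitter->cgroup_id;
}

/* Tracked choice: charge the I/O to the owner recorded on the first page. */
int toy_id_by_owner(const struct toy_bio *bio)
{
	assert(bio != NULL);
	/* Without a page there is nothing to look up: use the default group. */
	return bio->first_page ? bio->first_page->owner_id : 0;
}
```

If a page owned by group 3 is written back by a flusher thread that lives in
the default group, toy_id_by_submitter() reports 0 while toy_id_by_owner()
reports 3, which is the value a bandwidth controller would want to see.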

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 drivers/md/dm-ioband-type.c |   10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

Index: linux-2.6.30-rc3-git3/drivers/md/dm-ioband-type.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/drivers/md/dm-ioband-type.c
+++ linux-2.6.30-rc3-git3/drivers/md/dm-ioband-type.c
@@ -6,6 +6,7 @@
  * This file is released under the GPL.
  */
 #include <linux/bio.h>
+#include <linux/biotrack.h>
 #include "dm.h"
 #include "dm-ioband.h"
 
@@ -52,14 +53,7 @@ static int ioband_node(struct bio *bio)
 
 static int ioband_cgroup(struct bio *bio)
 {
-	/*
-	 * This function should return the ID of the cgroup which
-	 * issued "bio". The ID of the cgroup which the current
-	 * process belongs to won't be suitable ID for this purpose,
-	 * since some BIOs will be handled by kernel threads like aio
-	 * or pdflush on behalf of the process requesting the BIOs.
-	 */
-	return 0;	/* not implemented yet */
+	return get_blkio_cgroup_id(bio);
 }
 
 struct group_type dm_ioband_group_type[] = {

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 7/7] blkio-cgroup: Fast page tracking
       [not found]           ` <20090428.214705.112621770.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:47             ` Ryo Tsuruta
  0 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:47 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

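This is an extra patch which reduces the overhead of IO tracking at the cost
of a larger struct page_cgroup: the blkio-cgroup ID moves from the upper bits
of page_cgroup->flags into a dedicated int member, so it can be read and
written without taking lock_page_cgroup().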
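
The trade-off can be illustrated with a small userspace sketch. It is not
part of the patch: packed_pc and wide_pc are hypothetical structures, and
ID_SHIFT mirrors the PCG_TRACKING_ID_SHIFT value of 16 used by the code this
patch removes.

```c
#include <assert.h>
#include <limits.h>

#define ID_SHIFT 16
#define ID_BITS  (CHAR_BIT * sizeof(unsigned long) - ID_SHIFT)

/* Old layout: the ID shares one word with the flag bits. */
struct packed_pc { unsigned long flags; };

/* New layout: the ID gets its own member, so the structure grows. */
struct wide_pc { unsigned long flags; int blkio_id; };

unsigned long packed_get_id(const struct packed_pc *pc)
{
	return pc->flags >> ID_SHIFT;
}

/* Read-modify-write of the shared word: keep the low flag bits, replace the ID. */
void packed_set_id(struct packed_pc *pc, unsigned long id)
{
	assert(id < (1UL << ID_BITS));
	pc->flags &= (1UL << ID_SHIFT) - 1;
	pc->flags |= id << ID_SHIFT;
}

/* Dedicated member: an update is a single plain store. */
void wide_set_id(struct wide_pc *pc, int id)
{
	pc->blkio_id = id;
}
```

In the packed layout every update rewrites a word that also holds flag bits,
which is presumably why the removed helpers had to be called with the
page_cgroup lock held. With a dedicated member the lock can be dropped, and
sizeof(struct wide_pc) is larger than sizeof(struct packed_pc) in exchange.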

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 include/linux/biotrack.h    |    5 -
 include/linux/page_cgroup.h |   26 --------
 mm/biotrack.c               |  134 ++++++++++++++++++++++++++------------------
 3 files changed, 87 insertions(+), 78 deletions(-)

Index: linux-2.6.30-rc3-git3/mm/biotrack.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/biotrack.c
+++ linux-2.6.30-rc3-git3/mm/biotrack.c
@@ -3,9 +3,6 @@
  * Copyright (C) VA Linux Systems Japan, 2008-2009
  * Developed by Hirokazu Takahashi <taka@valinux.co.jp>
  *
- * Copyright (C) 2008 Andrea Righi <righi.andrea@gmail.com>
- * Use part of page_cgroup->flags to store blkio-cgroup ID.
- *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -20,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/smp.h>
 #include <linux/bit_spinlock.h>
+#include <linux/idr.h>
 #include <linux/blkdev.h>
 #include <linux/biotrack.h>
 #include <linux/mm_inline.h>
@@ -45,8 +43,11 @@ static inline struct blkio_cgroup *blkio
 					struct blkio_cgroup, css);
 }
 
+static struct idr blkio_cgroup_id;
+static DEFINE_SPINLOCK(blkio_cgroup_idr_lock);
 static struct io_context default_blkio_io_context;
 static struct blkio_cgroup default_blkio_cgroup = {
+	.id		= 0,
 	.io_context	= &default_blkio_io_context,
 };
 
@@ -61,7 +62,6 @@ void blkio_cgroup_set_owner(struct page 
 {
 	struct blkio_cgroup *biog;
 	struct page_cgroup *pc;
-	unsigned long id;
 
 	if (blkio_cgroup_disabled())
 		return;
@@ -69,29 +69,27 @@ void blkio_cgroup_set_owner(struct page 
 	if (unlikely(!pc))
 		return;
 
-	lock_page_cgroup(pc);
-	page_cgroup_set_id(pc, 0);	/* 0: default blkio_cgroup id */
-	unlock_page_cgroup(pc);
+	pc->blkio_cgroup_id = 0;	/* 0: default blkio_cgroup id */
 	if (!mm)
 		return;
 
+	/*
+	 * Locking "pc" isn't necessary here since the current process is
+	 * the only one that can access the members related to blkio_cgroup.
+	 */
 	rcu_read_lock();
 	biog = blkio_cgroup_from_task(rcu_dereference(mm->owner));
-	if (unlikely(!biog)) {
-		rcu_read_unlock();
-		return;
-	}
+	if (unlikely(!biog))
+		goto out;
 	/*
 	 * css_get(&bio->css) isn't called to increment the reference
 	 * count of this blkio_cgroup "biog" so the css_id might turn
 	 * invalid even if this page is still active.
 	 * This approach is chosen to minimize the overhead.
 	 */
-	id = css_id(&biog->css);
+	pc->blkio_cgroup_id = biog->id;
+out:
 	rcu_read_unlock();
-	lock_page_cgroup(pc);
-	page_cgroup_set_id(pc, id);
-	unlock_page_cgroup(pc);
 }
 
 /**
@@ -103,6 +101,13 @@ void blkio_cgroup_set_owner(struct page 
  */
 void blkio_cgroup_reset_owner(struct page *page, struct mm_struct *mm)
 {
+	/*
+	 * A little trick:
+	 * Just call blkio_cgroup_set_owner() for pages which are already
+	 * active since the blkio_cgroup_id member of page_cgroup can be
+	 * updated without any locks. This is because an aligned integer
+	 * can be stored in a single operation on modern CPUs.
+	 */
 	blkio_cgroup_set_owner(page, mm);
 }
 
@@ -133,7 +138,6 @@ void blkio_cgroup_reset_owner_pagedirty(
 void blkio_cgroup_copy_owner(struct page *npage, struct page *opage)
 {
 	struct page_cgroup *npc, *opc;
-	unsigned long id;
 
 	if (blkio_cgroup_disabled())
 		return;
@@ -144,12 +148,11 @@ void blkio_cgroup_copy_owner(struct page
 	if (unlikely(!opc))
 		return;
 
-	lock_page_cgroup(opc);
-	lock_page_cgroup(npc);
-	id = page_cgroup_get_id(opc);
-	page_cgroup_set_id(npc, id);
-	unlock_page_cgroup(npc);
-	unlock_page_cgroup(opc);
+	/*
+	 * Do this without any locks. The reason is the same as
+	 * blkio_cgroup_reset_owner().
+	 */
+	npc->blkio_cgroup_id = opc->blkio_cgroup_id;
 }
 
 /* Create a new blkio-cgroup. */
@@ -158,25 +161,44 @@ blkio_cgroup_create(struct cgroup_subsys
 {
 	struct blkio_cgroup *biog;
 	struct io_context *ioc;
+	int ret;
 
 	if (!cgrp->parent) {
 		biog = &default_blkio_cgroup;
 		init_io_context(biog->io_context);
 		/* Increment the referrence count not to be released ever. */
 		atomic_inc(&biog->io_context->refcount);
+		idr_init(&blkio_cgroup_id);
 		return &biog->css;
 	}
 
 	biog = kzalloc(sizeof(*biog), GFP_KERNEL);
-	if (!biog)
-		return ERR_PTR(-ENOMEM);
 	ioc = alloc_io_context(GFP_KERNEL, -1);
-	if (!ioc) {
-		kfree(biog);
-		return ERR_PTR(-ENOMEM);
+	if (!ioc || !biog) {
+		ret = -ENOMEM;
+		goto out_err;
 	}
 	biog->io_context = ioc;
+retry:
+	if (!idr_pre_get(&blkio_cgroup_id, GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+	spin_lock_irq(&blkio_cgroup_idr_lock);
+	ret = idr_get_new_above(&blkio_cgroup_id, (void *)biog, 1, &biog->id);
+	spin_unlock_irq(&blkio_cgroup_idr_lock);
+	if (ret == -EAGAIN)
+		goto retry;
+	else if (ret)
+		goto out_err;
+
 	return &biog->css;
+out_err:
+	if (biog)
+		kfree(biog);
+	if (ioc)
+		put_io_context(ioc);
+	return ERR_PTR(ret);
 }
 
 /* Delete the blkio-cgroup. */
@@ -185,10 +207,28 @@ static void blkio_cgroup_destroy(struct 
 	struct blkio_cgroup *biog = cgroup_blkio(cgrp);
 
 	put_io_context(biog->io_context);
-	free_css_id(&blkio_cgroup_subsys, &biog->css);
+
+	spin_lock_irq(&blkio_cgroup_idr_lock);
+	idr_remove(&blkio_cgroup_id, biog->id);
+	spin_unlock_irq(&blkio_cgroup_idr_lock);
+
 	kfree(biog);
 }
 
+static struct blkio_cgroup *find_blkio_cgroup(int id)
+{
+	struct blkio_cgroup *biog;
+	spin_lock_irq(&blkio_cgroup_idr_lock);
+	/*
+	 * The lookup may fail to find a blkio-cgroup associated with "id",
+	 * since a blkio-cgroup may be removed even while some of the I/O
+	 * requests it issued haven't completed yet.
+	 */
+	biog = (struct blkio_cgroup *)idr_find(&blkio_cgroup_id, id);
+	spin_unlock_irq(&blkio_cgroup_idr_lock);
+	return biog;
+}
+
 /**
  * get_blkio_cgroup_id() - determine the blkio-cgroup ID
  * @bio:	the &struct bio which describes the I/O
@@ -200,14 +240,11 @@ unsigned long get_blkio_cgroup_id(struct
 {
 	struct page_cgroup *pc;
 	struct page *page = bio_iovec_idx(bio, 0)->bv_page;
-	unsigned long id = 0;
+	int	id = 0;
 
 	pc = lookup_page_cgroup(page);
-	if (pc) {
-		lock_page_cgroup(pc);
-		id = page_cgroup_get_id(pc);
-		unlock_page_cgroup(pc);
-	}
+	if (pc)
+		id = pc->blkio_cgroup_id;
 	return id;
 }
 
@@ -219,21 +256,17 @@ unsigned long get_blkio_cgroup_id(struct
  */
 struct io_context *get_blkio_cgroup_iocontext(struct bio *bio)
 {
-	struct cgroup_subsys_state *css;
-	struct blkio_cgroup *biog;
+	struct blkio_cgroup *biog = NULL;
 	struct io_context *ioc;
-	unsigned long id;
+	int	id = 0;
 
 	id = get_blkio_cgroup_id(bio);
-	rcu_read_lock();
-	css = css_lookup(&blkio_cgroup_subsys, id);
-	if (css)
-		biog = container_of(css, struct blkio_cgroup, css);
-	else
+	if (id)
+		biog = find_blkio_cgroup(id);
+	if (!biog)
 		biog = &default_blkio_cgroup;
 	ioc = biog->io_context;	/* default io_context for this cgroup */
 	atomic_inc(&ioc->refcount);
-	rcu_read_unlock();
 	return ioc;
 }
 
@@ -249,17 +282,15 @@ struct io_context *get_blkio_cgroup_ioco
  */
 struct cgroup *blkio_cgroup_lookup(int id)
 {
-	struct cgroup *cgrp;
-	struct cgroup_subsys_state *css;
+	struct blkio_cgroup *biog = NULL;
 
 	if (blkio_cgroup_disabled())
 		return NULL;
-
-	css = css_lookup(&blkio_cgroup_subsys, id);
-	if (!css)
+	if (id)
+		biog = find_blkio_cgroup(id);
+	if (!biog)
 		return NULL;
-	cgrp = css->cgroup;
-	return cgrp;
+	return biog->css.cgroup;
 }
 EXPORT_SYMBOL(get_blkio_cgroup_iocontext);
 EXPORT_SYMBOL(get_blkio_cgroup_id);
@@ -273,7 +304,7 @@ static u64 blkio_id_read(struct cgroup *
 	rcu_read_lock();
 	id = css_id(&biog->css);
 	rcu_read_unlock();
-	return (u64)id;
+	return (u64) biog->id;
 }
 
 
@@ -296,5 +327,4 @@ struct cgroup_subsys blkio_cgroup_subsys
 	.destroy	= blkio_cgroup_destroy,
 	.populate	= blkio_cgroup_populate,
 	.subsys_id	= blkio_cgroup_subsys_id,
-	.use_id		= 1,
 };
Index: linux-2.6.30-rc3-git3/include/linux/biotrack.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/biotrack.h
+++ linux-2.6.30-rc3-git3/include/linux/biotrack.h
@@ -12,6 +12,7 @@ struct block_device;
 
 struct blkio_cgroup {
 	struct cgroup_subsys_state css;
+	int id;
 	struct io_context *io_context;	/* default io_context */
 /*	struct radix_tree_root io_context_root; per device io_context */
 };
@@ -24,9 +25,7 @@ struct blkio_cgroup {
  */
 static inline void __init_blkio_page_cgroup(struct page_cgroup *pc)
 {
-	lock_page_cgroup(pc);
-	page_cgroup_set_id(pc, 0);
-	unlock_page_cgroup(pc);
+	pc->blkio_cgroup_id = 0;
 }
 
 /**
Index: linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/page_cgroup.h
+++ linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
@@ -17,6 +17,9 @@ struct page_cgroup {
 	struct mem_cgroup *mem_cgroup;
 	struct list_head lru;		/* per cgroup LRU list */
 #endif
+#ifdef CONFIG_CGROUP_BLKIO
+	int blkio_cgroup_id;
+#endif
 };
 
 void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -124,27 +127,4 @@ static inline void swap_cgroup_swapoff(i
 }
 
 #endif
-
-#ifdef CONFIG_CGROUP_BLKIO
-/*
- * use lower 16 bits for flags and reserve the rest for the page tracking id
- */
-#define PCG_TRACKING_ID_SHIFT	(16)
-#define PCG_TRACKING_ID_BITS \
-	(8 * sizeof(unsigned long) - PCG_TRACKING_ID_SHIFT)
-
-/* NOTE: must be called with page_cgroup() held */
-static inline unsigned long page_cgroup_get_id(struct page_cgroup *pc)
-{
-	return pc->flags >> PCG_TRACKING_ID_SHIFT;
-}
-
-/* NOTE: must be called with page_cgroup() held */
-static inline void page_cgroup_set_id(struct page_cgroup *pc, unsigned long id)
-{
-	WARN_ON(id >= (1UL << PCG_TRACKING_ID_BITS));
-	pc->flags &= (1UL << PCG_TRACKING_ID_SHIFT) - 1;
-	pc->flags |= (unsigned long)(id << PCG_TRACKING_ID_SHIFT);
-}
-#endif
 #endif
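
A note on stale IDs: since a group can be removed while pages still carry its
ID, a lookup has to tolerate IDs that no longer resolve, and
get_blkio_cgroup_iocontext() falls back to the default group in that case.
The userspace sketch below shows that behaviour under simplifying
assumptions. It replaces the idr with a small fixed array and omits all
locking, and every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_GROUPS 8

struct toy_group { int id; const char *name; };

static struct toy_group toy_default_group = { 0, "default" };
static struct toy_group *toy_table[TOY_MAX_GROUPS];	/* slot 0 stays unused */

/* Hand out the lowest free ID starting at 1; return -1 when the table is full. */
int toy_register(struct toy_group *g)
{
	for (int id = 1; id < TOY_MAX_GROUPS; id++) {
		if (!toy_table[id]) {
			toy_table[id] = g;
			g->id = id;
			return id;
		}
	}
	return -1;
}

/* Drop the group from the table; pages may still carry its old ID. */
void toy_unregister(struct toy_group *g)
{
	assert(g->id > 0 && g->id < TOY_MAX_GROUPS);
	toy_table[g->id] = NULL;
}

/* Resolve an ID; unknown, stale or out-of-range IDs map to the default group. */
struct toy_group *toy_resolve(int id)
{
	struct toy_group *g = NULL;

	if (id > 0 && id < TOY_MAX_GROUPS)
		g = toy_table[id];
	return g ? g : &toy_default_group;
}
```

In this toy a freed ID can be handed out again by a later toy_register(), so
a stale ID may resolve to a different group once the slot is reused.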

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 7/7] blkio-cgroup: Fast page tracking
  2009-04-28 12:47         ` Ryo Tsuruta
       [not found]           ` <20090428.214705.112621770.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
@ 2009-04-28 12:47           ` Ryo Tsuruta
  2009-04-28 12:47           ` Ryo Tsuruta
  2 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-04-28 12:47 UTC (permalink / raw)
  To: linux-kernel, dm-devel, containers, virtualization, xen-devel

This is an extra patch which reduces the overhead of IO tracking but
increases the size of struct page_cgroup.

Based on 2.6.30-rc3-git3
Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

---
 include/linux/biotrack.h    |    5 -
 include/linux/page_cgroup.h |   26 --------
 mm/biotrack.c               |  134 ++++++++++++++++++++++++++------------------
 3 files changed, 87 insertions(+), 78 deletions(-)

Index: linux-2.6.30-rc3-git3/mm/biotrack.c
===================================================================
--- linux-2.6.30-rc3-git3.orig/mm/biotrack.c
+++ linux-2.6.30-rc3-git3/mm/biotrack.c
@@ -3,9 +3,6 @@
  * Copyright (C) VA Linux Systems Japan, 2008-2009
  * Developed by Hirokazu Takahashi <taka@valinux.co.jp>
  *
- * Copyright (C) 2008 Andrea Righi <righi.andrea@gmail.com>
- * Use part of page_cgroup->flags to store blkio-cgroup ID.
- *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation; either version 2 of the License, or
@@ -20,6 +17,7 @@
 #include <linux/module.h>
 #include <linux/smp.h>
 #include <linux/bit_spinlock.h>
+#include <linux/idr.h>
 #include <linux/blkdev.h>
 #include <linux/biotrack.h>
 #include <linux/mm_inline.h>
@@ -45,8 +43,11 @@ static inline struct blkio_cgroup *blkio
 					struct blkio_cgroup, css);
 }
 
+static struct idr blkio_cgroup_id;
+static DEFINE_SPINLOCK(blkio_cgroup_idr_lock);
 static struct io_context default_blkio_io_context;
 static struct blkio_cgroup default_blkio_cgroup = {
+	.id		= 0,
 	.io_context	= &default_blkio_io_context,
 };
 
@@ -61,7 +62,6 @@ void blkio_cgroup_set_owner(struct page 
 {
 	struct blkio_cgroup *biog;
 	struct page_cgroup *pc;
-	unsigned long id;
 
 	if (blkio_cgroup_disabled())
 		return;
@@ -69,29 +69,27 @@ void blkio_cgroup_set_owner(struct page 
 	if (unlikely(!pc))
 		return;
 
-	lock_page_cgroup(pc);
-	page_cgroup_set_id(pc, 0);	/* 0: default blkio_cgroup id */
-	unlock_page_cgroup(pc);
+	pc->blkio_cgroup_id = 0;	/* 0: default blkio_cgroup id */
 	if (!mm)
 		return;
 
+	/*
+	 * Locking "pc" isn't necessary here since the current process is
+	 * the only one that can access the members related to blkio_cgroup.
+	 */
 	rcu_read_lock();
 	biog = blkio_cgroup_from_task(rcu_dereference(mm->owner));
-	if (unlikely(!biog)) {
-		rcu_read_unlock();
-		return;
-	}
+	if (unlikely(!biog))
+		goto out;
 	/*
 	 * css_get(&bio->css) isn't called to increment the reference
 	 * count of this blkio_cgroup "biog" so the css_id might turn
 	 * invalid even if this page is still active.
 	 * This approach is chosen to minimize the overhead.
 	 */
-	id = css_id(&biog->css);
+	pc->blkio_cgroup_id = biog->id;
+out:
 	rcu_read_unlock();
-	lock_page_cgroup(pc);
-	page_cgroup_set_id(pc, id);
-	unlock_page_cgroup(pc);
 }
 
 /**
@@ -103,6 +101,13 @@ void blkio_cgroup_set_owner(struct page 
  */
 void blkio_cgroup_reset_owner(struct page *page, struct mm_struct *mm)
 {
+	/*
+	 * A little trick:
+	 * Just call blkio_cgroup_set_owner() for pages which are already
+	 * active since the blkio_cgroup_id member of page_cgroup can be
+	 * updated without any locks. This is because an integer type of
+	 * variable can be set a new value at once on modern cpus.
+	 */
 	blkio_cgroup_set_owner(page, mm);
 }
 
@@ -133,7 +138,6 @@ void blkio_cgroup_reset_owner_pagedirty(
 void blkio_cgroup_copy_owner(struct page *npage, struct page *opage)
 {
 	struct page_cgroup *npc, *opc;
-	unsigned long id;
 
 	if (blkio_cgroup_disabled())
 		return;
@@ -144,12 +148,11 @@ void blkio_cgroup_copy_owner(struct page
 	if (unlikely(!opc))
 		return;
 
-	lock_page_cgroup(opc);
-	lock_page_cgroup(npc);
-	id = page_cgroup_get_id(opc);
-	page_cgroup_set_id(npc, id);
-	unlock_page_cgroup(npc);
-	unlock_page_cgroup(opc);
+	/*
+	 * Do this without any locks. The reason is the same as
+	 * blkio_cgroup_reset_owner().
+	 */
+	npc->blkio_cgroup_id = opc->blkio_cgroup_id;
 }
 
 /* Create a new blkio-cgroup. */
@@ -158,25 +161,44 @@ blkio_cgroup_create(struct cgroup_subsys
 {
 	struct blkio_cgroup *biog;
 	struct io_context *ioc;
+	int ret;
 
 	if (!cgrp->parent) {
 		biog = &default_blkio_cgroup;
 		init_io_context(biog->io_context);
 		/* Increment the referrence count not to be released ever. */
 		atomic_inc(&biog->io_context->refcount);
+		idr_init(&blkio_cgroup_id);
 		return &biog->css;
 	}
 
 	biog = kzalloc(sizeof(*biog), GFP_KERNEL);
-	if (!biog)
-		return ERR_PTR(-ENOMEM);
 	ioc = alloc_io_context(GFP_KERNEL, -1);
-	if (!ioc) {
-		kfree(biog);
-		return ERR_PTR(-ENOMEM);
+	if (!ioc || !biog) {
+		ret = -ENOMEM;
+		goto out_err;
 	}
 	biog->io_context = ioc;
+retry:
+	if (!idr_pre_get(&blkio_cgroup_id, GFP_KERNEL)) {
+		ret = -EAGAIN;
+		goto out_err;
+	}
+	spin_lock_irq(&blkio_cgroup_idr_lock);
+	ret = idr_get_new_above(&blkio_cgroup_id, (void *)biog, 1, &biog->id);
+	spin_unlock_irq(&blkio_cgroup_idr_lock);
+	if (ret == -EAGAIN)
+		goto retry;
+	else if (ret)
+		goto out_err;
+
 	return &biog->css;
+out_err:
+	if (biog)
+		kfree(biog);
+	if (ioc)
+		put_io_context(ioc);
+	return ERR_PTR(ret);
 }
 
 /* Delete the blkio-cgroup. */
@@ -185,10 +207,28 @@ static void blkio_cgroup_destroy(struct 
 	struct blkio_cgroup *biog = cgroup_blkio(cgrp);
 
 	put_io_context(biog->io_context);
-	free_css_id(&blkio_cgroup_subsys, &biog->css);
+
+	spin_lock_irq(&blkio_cgroup_idr_lock);
+	idr_remove(&blkio_cgroup_id, biog->id);
+	spin_unlock_irq(&blkio_cgroup_idr_lock);
+
 	kfree(biog);
 }
 
+static struct blkio_cgroup *find_blkio_cgroup(int id)
+{
+	struct blkio_cgroup *biog;
+	spin_lock_irq(&blkio_cgroup_idr_lock);
+	/*
+	 * It might fail to find A bio-group associated with "id" since it
+	 * is allowed to remove the bio-cgroup even when some of I/O requests
+	 * this group issued haven't completed yet.
+	 */
+	biog = (struct blkio_cgroup *)idr_find(&blkio_cgroup_id, id);
+	spin_unlock_irq(&blkio_cgroup_idr_lock);
+	return biog;
+}
+
 /**
  * get_blkio_cgroup_id() - determine the blkio-cgroup ID
  * @bio:	the &struct bio which describes the I/O
@@ -200,14 +240,11 @@ unsigned long get_blkio_cgroup_id(struct
 {
 	struct page_cgroup *pc;
 	struct page *page = bio_iovec_idx(bio, 0)->bv_page;
-	unsigned long id = 0;
+	int	id = 0;
 
 	pc = lookup_page_cgroup(page);
-	if (pc) {
-		lock_page_cgroup(pc);
-		id = page_cgroup_get_id(pc);
-		unlock_page_cgroup(pc);
-	}
+	if (pc)
+		id = pc->blkio_cgroup_id;
 	return id;
 }
 
@@ -219,21 +256,17 @@ unsigned long get_blkio_cgroup_id(struct
  */
 struct io_context *get_blkio_cgroup_iocontext(struct bio *bio)
 {
-	struct cgroup_subsys_state *css;
-	struct blkio_cgroup *biog;
+	struct blkio_cgroup *biog = NULL;
 	struct io_context *ioc;
-	unsigned long id;
+	int	id = 0;
 
 	id = get_blkio_cgroup_id(bio);
-	rcu_read_lock();
-	css = css_lookup(&blkio_cgroup_subsys, id);
-	if (css)
-		biog = container_of(css, struct blkio_cgroup, css);
-	else
+	if (id)
+		biog = find_blkio_cgroup(id);
+	if (!biog)
 		biog = &default_blkio_cgroup;
 	ioc = biog->io_context;	/* default io_context for this cgroup */
 	atomic_inc(&ioc->refcount);
-	rcu_read_unlock();
 	return ioc;
 }
 
@@ -249,17 +282,15 @@ struct io_context *get_blkio_cgroup_ioco
  */
 struct cgroup *blkio_cgroup_lookup(int id)
 {
-	struct cgroup *cgrp;
-	struct cgroup_subsys_state *css;
+	struct blkio_cgroup *biog = NULL;
 
 	if (blkio_cgroup_disabled())
 		return NULL;
-
-	css = css_lookup(&blkio_cgroup_subsys, id);
-	if (!css)
+	if (id)
+		biog = find_blkio_cgroup(id);
+	if (!biog)
 		return NULL;
-	cgrp = css->cgroup;
-	return cgrp;
+	return biog->css.cgroup;
 }
 EXPORT_SYMBOL(get_blkio_cgroup_iocontext);
 EXPORT_SYMBOL(get_blkio_cgroup_id);
@@ -273,7 +304,7 @@ static u64 blkio_id_read(struct cgroup *
 	rcu_read_lock();
 	id = css_id(&biog->css);
 	rcu_read_unlock();
-	return (u64)id;
+	return (u64) biog->id;
 }
 
 
@@ -296,5 +327,4 @@ struct cgroup_subsys blkio_cgroup_subsys
 	.destroy	= blkio_cgroup_destroy,
 	.populate	= blkio_cgroup_populate,
 	.subsys_id	= blkio_cgroup_subsys_id,
-	.use_id		= 1,
 };
Index: linux-2.6.30-rc3-git3/include/linux/biotrack.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/biotrack.h
+++ linux-2.6.30-rc3-git3/include/linux/biotrack.h
@@ -12,6 +12,7 @@ struct block_device;
 
 struct blkio_cgroup {
 	struct cgroup_subsys_state css;
+	int id;
 	struct io_context *io_context;	/* default io_context */
 /*	struct radix_tree_root io_context_root; per device io_context */
 };
@@ -24,9 +25,7 @@ struct blkio_cgroup {
  */
 static inline void __init_blkio_page_cgroup(struct page_cgroup *pc)
 {
-	lock_page_cgroup(pc);
-	page_cgroup_set_id(pc, 0);
-	unlock_page_cgroup(pc);
+	pc->blkio_cgroup_id = 0;
 }
 
 /**
Index: linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
===================================================================
--- linux-2.6.30-rc3-git3.orig/include/linux/page_cgroup.h
+++ linux-2.6.30-rc3-git3/include/linux/page_cgroup.h
@@ -17,6 +17,9 @@ struct page_cgroup {
 	struct mem_cgroup *mem_cgroup;
 	struct list_head lru;		/* per cgroup LRU list */
 #endif
+#ifdef CONFIG_CGROUP_BLKIO
+	int blkio_cgroup_id;
+#endif
 };
 
 void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -124,27 +127,4 @@ static inline void swap_cgroup_swapoff(i
 }
 
 #endif
-
-#ifdef CONFIG_CGROUP_BLKIO
-/*
- * use lower 16 bits for flags and reserve the rest for the page tracking id
- */
-#define PCG_TRACKING_ID_SHIFT	(16)
-#define PCG_TRACKING_ID_BITS \
-	(8 * sizeof(unsigned long) - PCG_TRACKING_ID_SHIFT)
-
-/* NOTE: must be called with page_cgroup() held */
-static inline unsigned long page_cgroup_get_id(struct page_cgroup *pc)
-{
-	return pc->flags >> PCG_TRACKING_ID_SHIFT;
-}
-
-/* NOTE: must be called with page_cgroup() held */
-static inline void page_cgroup_set_id(struct page_cgroup *pc, unsigned long id)
-{
-	WARN_ON(id >= (1UL << PCG_TRACKING_ID_BITS));
-	pc->flags &= (1UL << PCG_TRACKING_ID_SHIFT) - 1;
-	pc->flags |= (unsigned long)(id << PCG_TRACKING_ID_SHIFT);
-}
-#endif
 #endif
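
For comparison, here is a minimal userspace sketch of the two storage schemes this hunk switches between: the ID packed into the upper bits of `flags` (the removed helpers) versus a plain integer field. It is an illustration only, not kernel code. The names `demo_page_cgroup`, `demo_get_id`, `demo_set_id`, `DEMO_ID_SHIFT` and `DEMO_ID_BITS` are hypothetical; only the 16-bit shift and the masking logic are taken from the removed code, and `assert()` stands in for `WARN_ON()`.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for struct page_cgroup; not the kernel definition. */
struct demo_page_cgroup {
	unsigned long flags;	/* low 16 bits: flags, upper bits: tracking ID */
	int blkio_cgroup_id;	/* separate field, as introduced by this patch */
};

#define DEMO_ID_SHIFT	16
#define DEMO_ID_BITS	(CHAR_BIT * sizeof(unsigned long) - DEMO_ID_SHIFT)

/* Packed scheme: read the ID back out of the upper bits of flags. */
static unsigned long demo_get_id(const struct demo_page_cgroup *pc)
{
	return pc->flags >> DEMO_ID_SHIFT;
}

/*
 * Packed scheme: a read-modify-write of flags. Concurrent updates to the
 * flag bits could be lost, which is why the original helpers required the
 * page_cgroup lock to be held.
 */
static void demo_set_id(struct demo_page_cgroup *pc, unsigned long id)
{
	assert(id < (1UL << DEMO_ID_BITS));
	pc->flags &= (1UL << DEMO_ID_SHIFT) - 1;	/* keep only the flag bits */
	pc->flags |= id << DEMO_ID_SHIFT;		/* store the ID above them */
}

/* Separate-field scheme: a single aligned integer store, no masking. */
static void demo_set_id_field(struct demo_page_cgroup *pc, int id)
{
	pc->blkio_cgroup_id = id;
}
```

In this sketch, calling `demo_set_id(&pc, 42)` on a structure whose `flags` is `0x5` leaves the low flag bits untouched while `demo_get_id(&pc)` returns 42. The field variant trades the extra `int` per `page_cgroup` for a store that needs no masking.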

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup
       [not found]       ` <20090428.214604.104034341.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org>
  2009-04-28 12:46         ` [PATCH 5/7] blkio-cgroup: Page tracking hooks Ryo Tsuruta
@ 2009-05-04 12:05         ` Andrea Righi
  1 sibling, 0 replies; 30+ messages in thread
From: Andrea Righi @ 2009-05-04 12:05 UTC (permalink / raw)
  To: Ryo Tsuruta
  Cc: containers, dm-devel, xen-devel, linux-kernel, virtualization

On Tue, Apr 28, 2009 at 09:46:04PM +0900, Ryo Tsuruta wrote:
> The body of bio-cgroup.
> 
> Based on 2.6.30-rc3-git3
> Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
> Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>

Hi Ryo,

few minor coding style issues reported by checkpatch.pl:

WARNING: line over 80 characters
#138: FILE: include/linux/biotrack.h:67:
+static inline void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm)

WARNING: line over 80 characters
#152: FILE: include/linux/biotrack.h:81:
+static inline void blkio_cgroup_copy_owner(struct page *page, struct page *opage)

WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
#513: FILE: mm/biotrack.c:265:
+EXPORT_SYMBOL(get_blkio_cgroup_id);

WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
#514: FILE: mm/biotrack.c:266:
+EXPORT_SYMBOL(blkio_cgroup_lookup);

total: 0 errors, 4 warnings, 515 lines checked

-Andrea
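
One possible way to address both kinds of warning is sketched below. This is a userspace illustration under stated assumptions, not the actual follow-up patch: `EXPORT_SYMBOL` is redefined here as a harmless placeholder so the fragment compiles outside the kernel tree, the structures are left opaque, and the function bodies are stubs.

```c
/*
 * Userspace stand-in (assumption): expands to a forward declaration of a
 * dummy struct so the fragment stays valid C outside the kernel tree.
 */
#define EXPORT_SYMBOL(sym)	struct demo_export_##sym

struct page;
struct mm_struct;
struct bio;

/* Prototype wrapped so that no line exceeds 80 columns. */
static inline void blkio_cgroup_set_owner(struct page *page,
					  struct mm_struct *mm)
{
	(void)page;
	(void)mm;
}

/* Stub body; the real function looks up the ID via the page_cgroup. */
unsigned long get_blkio_cgroup_id(struct bio *bio)
{
	(void)bio;
	return 0;	/* placeholder: 0 is the default blkio-cgroup ID */
}
/* The export directly follows the function it refers to. */
EXPORT_SYMBOL(get_blkio_cgroup_id);
```

Whether to move the exports next to their functions or keep them grouped at the end of the file remains a matter of taste for the patch author.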

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup
  2009-05-04 12:05         ` Andrea Righi
@ 2009-05-04 12:39           ` Alan Cox
  -1 siblings, 0 replies; 30+ messages in thread
From: Alan Cox @ 2009-05-04 12:39 UTC (permalink / raw)
  To: Andrea Righi
  Cc: xen-devel, containers, linux-kernel, virtualization, dm-devel

> few minor coding style issues reported by checkpatch.pl:

checkpatch is a *guide* not some dictator of style. If it's more readable
the way it is, or it's following existing style (eg with the export
symbols in a group at the bottom) then checkpatch is best ignored.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup
  2009-05-04 12:05         ` Andrea Righi
                           ` (3 preceding siblings ...)
  (?)
@ 2009-05-04 22:43         ` Ryo Tsuruta
  -1 siblings, 0 replies; 30+ messages in thread
From: Ryo Tsuruta @ 2009-05-04 22:43 UTC (permalink / raw)
  To: righi.andrea
  Cc: containers, dm-devel, xen-devel, linux-kernel, virtualization

Hi Andrea,

From: Andrea Righi <righi.andrea@gmail.com>
Subject: Re: [PATCH 4/7] blkio-cgroup: The body of blkio-cgroup
Date: Mon, 4 May 2009 14:05:45 +0200

> On Tue, Apr 28, 2009 at 09:46:04PM +0900, Ryo Tsuruta wrote:
> > The body of bio-cgroup.
> > 
> > Based on 2.6.30-rc3-git3
> > Signed-off-by: Hirokazu Takahashi <taka@valinux.co.jp>
> > Signed-off-by: Ryo Tsuruta <ryov@valinux.co.jp>
> 
> Hi Ryo,
> 
> few minor coding style issues reported by checkpatch.pl:

Thanks for reporting these issues. I'll fix them in the next release.

> 
> WARNING: line over 80 characters
> #138: FILE: include/linux/biotrack.h:67:
> +static inline void blkio_cgroup_set_owner(struct page *page, struct mm_struct *mm)
> 
> WARNING: line over 80 characters
> #152: FILE: include/linux/biotrack.h:81:
> +static inline void blkio_cgroup_copy_owner(struct page *page, struct page *opage)
> 
> WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
> #513: FILE: mm/biotrack.c:265:
> +EXPORT_SYMBOL(get_blkio_cgroup_id);
> 
> WARNING: EXPORT_SYMBOL(foo); should immediately follow its function/variable
> #514: FILE: mm/biotrack.c:266:
> +EXPORT_SYMBOL(blkio_cgroup_lookup);
> 
> total: 0 errors, 4 warnings, 515 lines checked
> 
> -Andrea

Thanks,
Ryo Tsuruta

^ permalink raw reply	[flat|nested] 30+ messages in thread
