* [RFCv5 0/9] CMA + VCMM integration
@ 2010-09-06  6:33 ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

Hello everyone,

This patchset introduces a draft of a redesign of Zach Pfeffer's
VCMM.  Not all of the functionality of the original VCMM has been
ported into this patchset.  This is mostly meant as an RFC.  Moreover,
the VCMM implementation code in this RFC has not been tested.

CMA has not been changed compared to the previous CMA versions, so
none of the aspects discussed on the list have been addressed yet.

The redesigned VCMM now uses the notion of drivers -- a VCM context is
created for each MMU on the platform and each such context is handled
by a VCM driver.  A context (or many contexts) for One-to-One mappings
is created as well and handled by a One-to-One VCM driver.

The patchset introduces a sample (or a template, if you will) VCM MMU
driver as well as a VCM CMA One-to-One driver, showing by example how
VCM drivers are written.

The VCMM framework proposed by this patchset also introduces
a vcm_make_binding() call which allocates physical memory, creates
a virtual address reservation and binds the two together.  This makes
life easier for One-to-One mappings, and if device drivers limit their
use of the VCM API to this subset of functionality, they can work on
systems with or without an MMU with no modifications (only the VCM
context would need to change).
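
To give a rough idea of the intended usage, a driver that only ever
needs a bound reservation could use it along the lines of the sketch
below (an illustration only; the exact types and signatures are
defined in the seventh patch, and the context pointer "ctx" is assumed
to come from platform code):

        res = vcm_make_binding(ctx, size, alignment, 0);
        if (IS_ERR(res))
                return PTR_ERR(res);
        /* Program the device with the reservation's address; this
         * works the same whether ctx is backed by a real MMU or by
         * the One-to-One driver. */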

Please refer to the documentation in the second and seventh patches
for more information regarding CMA and VCMM respectively.

Michal Nazarewicz (9):
  lib: rbtree: rb_root_init() function added
  mm: cma: Contiguous Memory Allocator added
  mm: cma: Added SysFS support
  mm: cma: Added command line parameters support
  mm: cma: Test device and application added
  ARM: cma: Added CMA to Aquila, Goni and c210 universal boards
  mm: vcm: Virtual Contiguous Memory framework added
  mm: vcm: Sample driver added
  mm: vcm: vcm-cma: VCM CMA driver added

 Documentation/00-INDEX                             |    4 +
 .../ABI/testing/sysfs-kernel-mm-contiguous         |   53 +
 Documentation/contiguous-memory.txt                |  623 +++++++++
 Documentation/kernel-parameters.txt                |    7 +
 Documentation/virtual-contiguous-memory.txt        |  866 ++++++++++++
 arch/arm/mach-s5pv210/mach-aquila.c                |   31 +
 arch/arm/mach-s5pv210/mach-goni.c                  |   31 +
 arch/arm/mach-s5pv310/mach-universal_c210.c        |   23 +
 drivers/misc/Kconfig                               |    8 +
 drivers/misc/Makefile                              |    1 +
 drivers/misc/cma-dev.c                             |  202 +++
 include/linux/cma.h                                |  479 +++++++
 include/linux/rbtree.h                             |   11 +
 include/linux/vcm-cma.h                            |   38 +
 include/linux/vcm-drv.h                            |  299 +++++
 include/linux/vcm-sample.h                         |   30 +
 include/linux/vcm.h                                |  275 ++++
 mm/Kconfig                                         |  123 ++
 mm/Makefile                                        |    5 +
 mm/cma-best-fit.c                                  |  407 ++++++
 mm/cma.c                                           | 1377 ++++++++++++++++++++
 mm/vcm-cma.c                                       |   84 ++
 mm/vcm-sample.c                                    |  120 ++
 mm/vcm.c                                           |  932 +++++++++++++
 tools/cma/cma-test.c                               |  386 ++++++
 25 files changed, 6415 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 Documentation/virtual-contiguous-memory.txt
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 include/linux/cma.h
 create mode 100644 include/linux/vcm-cma.h
 create mode 100644 include/linux/vcm-drv.h
 create mode 100644 include/linux/vcm-sample.h
 create mode 100644 include/linux/vcm.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c
 create mode 100644 mm/vcm-cma.c
 create mode 100644 mm/vcm-sample.c
 create mode 100644 mm/vcm.c
 create mode 100644 tools/cma/cma-test.c


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [RFCv5 1/9] lib: rbtree: rb_root_init() function added
  2010-09-06  6:33 ` Michal Nazarewicz
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

Added an rb_root_init() function which initialises an rb_root
structure as a red-black tree with at most one element.  The
rationale is that using rb_root_init(root, node) is more
straightforward and cleaner than first initialising an
empty tree and then performing an insert operation.
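
A hypothetical caller could then simply do:

	struct rb_root root;

	rb_root_init(&root, &first_object->node);

instead of initialising the root with RB_ROOT and then linking and
rebalancing the first node by hand.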

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 include/linux/rbtree.h |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 7066acb..5b6dc66 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -130,6 +130,17 @@ static inline void rb_set_color(struct rb_node *rb, int color)
 }
 
 #define RB_ROOT	(struct rb_root) { NULL, }
+
+static inline void rb_root_init(struct rb_root *root, struct rb_node *node)
+{
+	root->rb_node = node;
+	if (node) {
+		node->rb_parent_color = RB_BLACK; /* black, no parent */
+		node->rb_left  = NULL;
+		node->rb_right = NULL;
+	}
+}
+
 #define	rb_entry(ptr, type, member) container_of(ptr, type, member)
 
 #define RB_EMPTY_ROOT(root)	((root)->rb_node == NULL)
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 2/9] mm: cma: Contiguous Memory Allocator added
  2010-09-06  6:33 ` Michal Nazarewicz
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

The Contiguous Memory Allocator framework is a set of APIs for
allocating physically contiguous chunks of memory.

Various chips require contiguous blocks of memory to operate.  Those
chips include devices such as cameras, hardware video decoders and
encoders, etc.

The code is highly modular and customisable to suit the needs of
various users.  The set of regions reserved for CMA can be configured
per platform, and it is easy to add custom allocator algorithms if
the need arises.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
---
 Documentation/00-INDEX              |    2 +
 Documentation/contiguous-memory.txt |  544 +++++++++++++++++++++
 include/linux/cma.h                 |  434 +++++++++++++++++
 mm/Kconfig                          |   41 ++
 mm/Makefile                         |    2 +
 mm/cma-best-fit.c                   |  407 ++++++++++++++++
 mm/cma.c                            |  911 +++++++++++++++++++++++++++++++++++
 7 files changed, 2341 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 8dfc670..f93e787 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -94,6 +94,8 @@ connector/
 	- docs on the netlink based userspace<->kernel space communication mod.
 console/
 	- documentation on Linux console drivers.
+contiguous-memory.txt
+	- documentation on physically-contiguous memory allocation framework.
 cpu-freq/
 	- info on CPU frequency and voltage scaling.
 cpu-hotplug.txt
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
new file mode 100644
index 0000000..e470c6f
--- /dev/null
+++ b/Documentation/contiguous-memory.txt
@@ -0,0 +1,544 @@
+                                                             -*- org -*-
+
+* Contiguous Memory Allocator
+
+   The Contiguous Memory Allocator (CMA) is a framework which allows
+   setting up a machine-specific configuration for physically-contiguous
+   memory management.  Memory for devices is then allocated according
+   to that configuration.
+
+   The main role of the framework is not to allocate memory, but to
+   parse and manage memory configurations, as well as to act as an
+   intermediary between device drivers and pluggable allocators.  It is
+   thus not tied to any memory allocation method or strategy.
+
+** Why is it needed?
+
+    Various devices on embedded systems have no scatter-gather and/or
+    IO map support and as such require contiguous blocks of memory to
+    operate.  They include devices such as cameras, hardware video
+    decoders and encoders, etc.
+
+    Such devices often require big memory buffers (a full HD frame is,
+    for instance, more than 2 megapixels large, i.e. more than 6 MB
+    of memory), which makes mechanisms such as kmalloc() ineffective.
+
+    Some embedded devices impose additional requirements on the
+    buffers, e.g. they can operate only on buffers allocated in
+    a particular location/memory bank (if the system has more than one
+    memory bank) or on buffers aligned to a particular memory boundary.
+
+    Development of embedded devices has seen a big rise recently
+    (especially in the V4L area) and many such drivers include their
+    own memory allocation code.  Most of them use bootmem-based methods.
+    The CMA framework is an attempt to unify contiguous memory allocation
+    mechanisms and provide a simple API for device drivers, while
+    staying as customisable and modular as possible.
+
+** Design
+
+    The main design goal for the CMA was to provide a customisable and
+    modular framework, which could be configured to suit the needs of
+    individual systems.  Configuration specifies a list of memory
+    regions, which then are assigned to devices.  Memory regions can
+    be shared among many device drivers or assigned exclusively to
+    one.  This has been achieved in the following ways:
+
+    1. The core of the CMA does not handle allocation of memory and
+       management of free space.  Dedicated allocators are used for
+       that purpose.
+
+       This way, if the provided solution does not match demands
+       imposed on a given system, one can develop a new algorithm and
+       easily plug it into the CMA framework.
+
+       The presented solution includes an implementation of a best-fit
+       algorithm.
+
+    2. When requesting memory, devices have to introduce themselves.
+       This way CMA knows who the memory is allocated for.  This
+       allows the system architect to specify which memory regions
+       each device should use.
+
+    3. Memory regions are grouped into various "types".  When a device
+       requests a chunk of memory, it can specify what type of memory
+       it needs.  If no type is specified, "common" is assumed.
+
+       This makes it possible to configure the system in such a way
+       that a single device may get memory from different memory
+       regions, depending on the "type" of memory it requested.  For
+       example, a video codec driver might want to allocate some
+       buffers from the first memory bank and others from the second
+       to get the highest possible memory throughput.
+
+    4. For greater flexibility and extensibility, the framework allows
+       device drivers to register private regions of reserved memory
+       which then may be used only by them.
+
+       As a result, even if a driver does not use the rest of the CMA
+       interface, it can still use CMA allocators and other
+       mechanisms.
+
+       4a. Early in the boot process, device drivers can also request
+           the CMA framework to reserve a region of memory for them,
+           which will then be used as a private region.
+
+           This way, drivers do not need to directly call bootmem,
+           memblock or a similar early allocator but merely register an
+           early region, and the framework will handle the rest,
+           including choosing the right early allocator.
+
+** Use cases
+
+    Let's analyse an imaginary system that uses the CMA to see how
+    the framework can be used and configured.
+
+
+    We have a platform with a hardware video decoder and a camera, each
+    needing 20 MiB of memory in the worst case.  Our system is written
+    in such a way, though, that the two devices are never used at the
+    same time and memory for them may be shared.  In such a system the
+    following configuration would be used in the platform
+    initialisation code:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video,camera=region";
+
+        cma_set_defaults(regions, map);
+
+    The regions array defines a single 20-MiB region named "region".
+    The map says that drivers named "video" and "camera" are to be
+    granted memory from the previously defined region.
+
+    A shorter map can be used as well:
+
+        static const char map[] __initconst = "*=region";
+
+    The asterisk ("*") matches all devices thus all devices will use
+    the region named "region".
+
+    We can see that, because the devices share the same memory region,
+    we save 20 MiB compared to the situation where each of the devices
+    would reserve 20 MiB of memory for itself.
+
+
+    Now, let's say that we also have many other smaller devices and we
+    want them to share a smaller pool of memory, for instance 5 MiB.
+    This can be achieved in the following way:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst =
+                "video,camera=region;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This instructs CMA to reserve two regions and let video and camera
+    use region "region" whereas all other devices should use region
+    "common".
+
+
+    Later on, after some development of the system, it can now run the
+    video decoder and the camera at the same time.  The 20 MiB region is
+    no longer enough for the two to share.  A quick fix can be made to
+    grant each of those devices a separate region:
+
+        static struct cma_region regions[] = {
+                { .name = "v", .size = 20 << 20 },
+                { .name = "c", .size = 20 << 20 },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video=v;camera=c;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This solution also shows how with CMA you can assign private pools
+    of memory to each device if that is required.
+
+
+    Allocation mechanisms can be replaced dynamically in a similar
+    manner as well. Let's say that during testing, it has been
+    discovered that, for a given shared region of 40 MiB,
+    fragmentation has become a problem.  It has been observed that,
+    after some time, it becomes impossible to allocate buffers of the
+    required sizes. So to satisfy our requirements, we would have to
+    reserve a larger shared region beforehand.
+
+    But fortunately, you have also managed to develop a new allocation
+    algorithm -- Neat Allocation Algorithm or "na" for short -- which
+    satisfies the needs for both devices even on a 30 MiB region.  The
+    configuration can be then quickly changed to:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 30 << 20, .alloc_name = "na" },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video,camera=region;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This shows how you can develop your own allocation algorithms if
+    the ones provided with CMA do not suit your needs and easily
+    replace them, without the need to modify the CMA core or even
+    recompile the kernel.
+
+** Technical Details
+
+*** The attributes
+
+    As shown above, CMA is configured by two attributes: the list of
+    regions and the map.  The first one specifies regions that are to be
+    reserved for CMA.  The second one specifies which regions each
+    device is assigned to.
+
+**** Regions
+
+     The regions attribute is a list of regions terminated by a region
+     with size equal to zero.  The following fields may be set:
+
+     - size       -- size of the region (required, must not be zero)
+     - alignment  -- alignment of the region; must be power of two or
+                     zero (optional)
+     - start      -- where the region has to start (optional)
+     - alloc_name -- the name of allocator to use (optional)
+     - alloc      -- allocator to use (optional; besides,
+                     alloc_name is probably what you want)
+
+     size, alignment and start are specified in bytes.  Size will be
+     aligned up to PAGE_SIZE.  If alignment is less than PAGE_SIZE
+     it will be set to PAGE_SIZE.  start will be aligned to
+     alignment.
+
+**** Map
+
+     The format of the "map" attribute is as follows:
+
+         map-attr      ::= [ rules [ ';' ] ]
+         rules         ::= rule [ ';' rules ]
+         rule          ::= patterns '=' regions
+
+         patterns      ::= pattern [ ',' patterns ]
+
+         regions       ::= REG-NAME [ ',' regions ]
+                       // list of regions to try to allocate memory
+                       // from
+
+         pattern       ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+                       // pattern the request must match for the rule
+                       // to apply; the first rule that matches is
+                       // applied; if the dev-pattern part is omitted,
+                       // a value identical to the one used in the
+                       // previous pattern is assumed.
+
+         dev-pattern   ::= PATTERN
+                       // pattern that the device name must match for
+                       // the rule to apply; may contain question marks
+                       // which match any single character and may end
+                       // with an asterisk which matches the rest of
+                       // the string (including nothing).
+
+     It is a sequence of rules which specify which regions a given
+     (device, type) pair should use.  The first matching rule is applied.
+
+     For a rule to match, the pattern must match the (dev, type) pair.
+     A pattern consists of the parts before and after the slash.  The
+     first part must match the device name and the second part the type.
+
+     If the first part is empty, the device name is assumed to match
+     iff it matched in the previous pattern.  If the second part is
+     omitted, it will match any type of memory requested by the device.
+
+     Some examples (whitespace added for better readability):
+
+         cma_map = foo/quaz = r1;
+                       // device foo with type == "quaz" uses region r1
+
+                   foo/* = r2;     // OR:
+                   /* = r2;
+                       // device foo with any other kind uses region r2
+
+                   bar = r1,r2;
+                       // device bar uses region r1 or r2
+
+                   baz?/a , baz?/b = r3;
+                       // devices named baz? where ? is any character
+                       // with type being "a" or "b" use r3
+
+*** The device and types of memory
+
+    The name of the device is taken from the device structure.  It is
+    not possible to use CMA if the driver does not register a device
+    (actually this can be overcome if a fake device structure is
+    provided with at least the name set).
+
+    The type of memory is an optional argument provided by the device
+    whenever it requests a memory chunk.  In many cases this can be
+    ignored but sometimes it may be required for some devices.
+
+    For instance, let's say that there are two memory banks and for
+    performance reasons a device uses buffers in both of them.
+    The platform defines memory types "a" and "b" for regions in the two
+    banks.  The device driver would then use those two types to
+    request memory chunks from different banks.  CMA attributes could
+    look as follows:
+
+         static struct cma_region regions[] = {
+                 { .name = "a", .size = 32 << 20 },
+                 { .name = "b", .size = 32 << 20, .start = 512 << 20 },
+                 { }
+         };
+         static const char map[] __initconst = "foo/a=a;foo/b=b;*=a,b";
+
+    And whenever the driver allocated memory, it would specify the
+    type of memory:
+
+        buffer1 = cma_alloc(dev, "a", 1 << 20, 0);
+        buffer2 = cma_alloc(dev, "b", 1 << 20, 0);
+
+    If it were necessary to fall back to the other bank when the
+    dedicated one is full, the map attribute could be changed to:
+
+         static const char map[] __initconst = "foo/a=a,b;foo/b=b,a;*=a,b";
+
+    On the other hand, if the same driver was used on a system with
+    only one bank, the configuration could be changed just to:
+
+         static struct cma_region regions[] = {
+                 { .name = "r", .size = 64 << 20 },
+                 { }
+         };
+         static const char map[] __initconst = "*=r";
+
+    without the need to change the driver at all.
+
+*** Device API
+
+    There are three basic calls provided by the CMA framework to
+    devices.  To allocate a chunk of memory cma_alloc() function needs
+    to be used:
+
+        dma_addr_t cma_alloc(const struct device *dev, const char *type,
+                             size_t size, dma_addr_t alignment);
+
+    If required, the device may specify the alignment in bytes that the
+    chunk needs to satisfy.  It has to be a power of two or zero.  The
+    chunks are always aligned at least to a page.
+
+    The type specifies the type of memory as described in the
+    previous subsection.  If the device driver does not care about the
+    memory type, it can safely pass NULL as the type, which is the same
+    as passing "common".
+
+    The basic usage of the function is just:
+
+        addr = cma_alloc(dev, NULL, size, 0);
+
+    The function returns the bus address of the allocated chunk, or a
+    value that evaluates to true when checked with IS_ERR_VALUE(), so
+    the correct way of checking for errors is:
+
+        unsigned long addr = cma_alloc(dev, NULL, size, 0);
+        if (IS_ERR_VALUE(addr))
+                /* Error */
+                return (int)addr;
+        /* Allocated */
+
+    (Make sure to include <linux/err.h> which contains the definition
+    of the IS_ERR_VALUE() macro.)
+
+
+    An allocated chunk is freed via the cma_free() function:
+
+        int cma_free(dma_addr_t addr);
+
+    It takes the bus address of the chunk as an argument and frees it.
+
+
+    The last function is cma_info(), which returns information
+    about regions assigned to a given (dev, type) pair.  Its prototype is:
+
+        int cma_info(struct cma_info *info,
+                     const struct device *dev,
+                     const char *type);
+
+    On successful exit it fills the info structure with the lower and
+    upper bounds of the regions, the total size and the number of
+    regions assigned to the given (dev, type) pair.
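+
+    For instance (a sketch only), a driver could check how much CMA
+    memory it has been assigned before deciding on its buffer sizes:
+
+        struct cma_info info;
+        int ret = cma_info(&info, dev, NULL);
+        if (ret)
+                return ret;
+        pr_info("%u region(s), %zu bytes total\n",
+                info.count, info.total_size);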
+
+**** Dynamic and private regions
+
+     In the basic setup, regions are provided and initialised by
+     platform initialisation code (which usually uses
+     cma_set_defaults() for that purpose).
+
+     It is, however, possible to create and add regions dynamically
+     using the cma_region_register() function.
+
+         int cma_region_register(struct cma_region *reg);
+
+     The region does not have to have a name.  If it does not, it won't
+     be accessible via the standard mapping (the one provided with the
+     map attribute).  Such regions are private, and to allocate a chunk
+     from them one needs to call:
+
+         dma_addr_t cma_alloc_from_region(struct cma_region *reg,
+                                          size_t size, dma_addr_t alignment);
+
+     It is just like cma_alloc() except that one specifies the region
+     to allocate memory from.  The region must have been registered.
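+
+     For example (a sketch), a driver holding a private region could
+     allocate from it with:
+
+         dma_addr_t addr = cma_alloc_from_region(&my_region, size, 0);
+         if (IS_ERR_VALUE(addr))
+                 return (int)addr;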
+
+**** Allocating from region specified by name
+
+     If a driver prefers allocating from a region (or list of regions)
+     whose name it knows, it can use a different call similar to the
+     previous one:
+
+         dma_addr_t cma_alloc_from(const char *regions,
+                                   size_t size, dma_addr_t alignment);
+
+     The first argument is a comma-separated list of regions the
+     driver desires CMA to try and allocate from.  The list is
+     terminated by a NUL byte or a semicolon.
+
+     Similarly, there is a call for requesting information about named
+     regions:
+
+        int cma_info_about(struct cma_info *info, const char *regions);
+
+     Generally, there should be no need to use those interfaces, but
+     they are provided nevertheless.
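+
+     For instance, to try the regions named "v" and then "c" from the
+     earlier examples (a sketch):
+
+         dma_addr_t addr = cma_alloc_from("v,c", 1 << 20, 0);
+         if (IS_ERR_VALUE(addr))
+                 return (int)addr;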
+
+**** Registering early regions
+
+     An early region is a region that is managed by CMA early during
+     the boot process.  It is the platform's responsibility to reserve
+     memory for early regions.  Later on, when CMA initialises, early
+     regions with reserved memory are registered as normal regions.
+     Registering an early region may be a way for a device to request
+     a private pool of memory without worrying about actually
+     reserving the memory:
+
+         int cma_early_region_register(struct cma_region *reg);
+
+     This needs to be done quite early in the boot process, before the
+     platform traverses the cma_early_regions list to reserve memory.
+
+     When the boot process ends, the device driver may check whether the
+     region was reserved (by checking the reg->reserved flag) and, if so,
+     whether it was successfully registered as a normal region (by
+     checking the reg->registered flag).  If that is the case, the
+     device driver can use normal API calls to use the region.
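+
+     A minimal sketch of such a driver-private early region (the size,
+     alignment and calling site are made up here) could be:
+
+         static struct cma_region foo_region = {
+                 .size      = 16 << 20,
+                 .alignment = 1 << 20,
+         };
+
+         /* called from early machine setup, before the platform
+            reserves memory for early regions */
+         err = cma_early_region_register(&foo_region);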
+
+*** Allocator operations
+
+    Creating an allocator for CMA requires four functions to be
+    implemented.
+
+
+    The first two are used to initialise an allocator on a given region
+    and to clean up afterwards:
+
+        int  cma_foo_init(struct cma_region *reg);
+        void cma_foo_cleanup(struct cma_region *reg);
+
+    The first is called when the allocator is attached to a region.  When
+    the function is called, the cma_region structure is fully
+    initialised (i.e. the starting address and size have correct values).
+    As a matter of fact, the allocator should never modify the cma_region
+    structure other than the private_data field, which it may use to
+    point to its private data.
+
+    The second call cleans up and frees all resources the allocator
+    has allocated for the region.  The function can assume that all
+    chunks allocated from this region have been freed and thus the whole
+    region is free.
+
+
+    The two other calls are used for allocating and freeing chunks.
+    They are:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment);
+        void cma_foo_free(struct cma_chunk *chunk);
+
+    As the names imply, the first allocates a chunk of memory and the
+    other frees it.  The allocator also manages the cma_chunk object
+    representing the chunk in physical memory.
+
+    Both of those functions can assume that they are the only thread
+    accessing the region.  Therefore, the allocator does not need to
+    worry about concurrency.  Moreover, all arguments are guaranteed to
+    be valid (i.e. a page-aligned size and a power-of-two alignment no
+    lower than a page size).
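+
+    For illustration only (not part of this patchset), a deliberately
+    trivial allocator that hands out at most one chunk per region and
+    assumes NULL signals an allocation failure could be sketched as:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment)
+        {
+                struct cma_chunk *chunk;
+
+                /* only a single chunk and only the region's natural
+                 * alignment are supported by this toy allocator */
+                if (reg->private_data || size > reg->size)
+                        return NULL;
+
+                chunk = kzalloc(sizeof *chunk, GFP_KERNEL);
+                if (!chunk)
+                        return NULL;
+
+                chunk->start = reg->start;
+                chunk->size  = size;
+                reg->private_data = chunk;
+                return chunk;
+        }
+
+        void cma_foo_free(struct cma_chunk *chunk)
+        {
+                chunk->reg->private_data = NULL;
+                kfree(chunk);
+        }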
+
+
+    When the allocator is ready, all that is left is to register it by
+    calling the cma_allocator_register() function:
+
+            int cma_allocator_register(struct cma_allocator *alloc);
+
+    The argument is a structure with pointers to the above functions
+    and the allocator's name.  The whole call may look something like
+    this:
+
+        static struct cma_allocator alloc = {
+                .name    = "foo",
+                .init    = cma_foo_init,
+                .cleanup = cma_foo_cleanup,
+                .alloc   = cma_foo_alloc,
+                .free    = cma_foo_free,
+        };
+        return cma_allocator_register(&alloc);
+
+    The name ("foo") will be used when a this particular allocator is
+    requested as an allocator for given region.
+
+*** Integration with platform
+
+    There is one function that needs to be called from platform
+    initialisation code.  That is the cma_early_regions_reserve()
+    function:
+
+        void cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+    It traverses the list of all the early regions provided by the
+    platform or registered by drivers and reserves memory for them.  The
+    only argument is a callback function used to reserve each region.
+    Passing NULL as the argument is the same as passing the
+    cma_early_region_reserve() function, which uses bootmem or
+    memblock for the allocation.
+
+    Alternatively, platform code could traverse the cma_early_regions
+    list by itself but this should never be necessary.
+
+
+    The platform also has a way of providing default attributes for CMA;
+    the cma_set_defaults() function is used for that purpose:
+
+        int cma_set_defaults(struct cma_region *regions, const char *map)
+
+    It needs to be called prior to reserving regions.  It lets one
+    specify the list of regions defined by the platform and the map
+    attribute.  The map may point to a string in __initdata.  See
+    above in this document for example usage of this function.
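+
+    Putting it together, a machine's early reservation hook might thus
+    look roughly like (a sketch; the name of the hook is platform
+    specific, and regions/map are the arrays from the examples above):
+
+        static void __init foo_machine_reserve(void)
+        {
+                cma_set_defaults(regions, map);
+                cma_early_regions_reserve(NULL);
+        }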
+
+** Future work
+
+    In the future, implementation of mechanisms that would allow the
+    free space inside the regions to be used as page cache, filesystem
+    buffers or swap devices is planned.  With such mechanisms, the
+    memory would not be wasted when not used.
+
+    Because all allocation and freeing of chunks passes through the CMA
+    framework, it can track which parts of the reserved memory are
+    free and which are allocated.  Tracking the unused memory
+    would let CMA use it for other purposes such as page cache, I/O
+    buffers, swap, etc.
diff --git a/include/linux/cma.h b/include/linux/cma.h
new file mode 100644
index 0000000..f6f9cb5
--- /dev/null
+++ b/include/linux/cma.h
@@ -0,0 +1,434 @@
+#ifndef __LINUX_CMA_H
+#define __LINUX_CMA_H
+
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+/***************************** Kernel level API *****************************/
+
+#ifdef __KERNEL__
+
+#include <linux/rbtree.h>
+#include <linux/list.h>
+#include <linux/init.h>
+
+
+struct device;
+struct cma_info;
+
+/*
+ * Don't call it directly, use cma_alloc(), cma_alloc_from() or
+ * cma_alloc_from_region().
+ */
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *type,
+	    size_t size, dma_addr_t alignment);
+
+/* Don't call it directly, use cma_info() or cma_info_about(). */
+int
+__cma_info(struct cma_info *info, const struct device *dev, const char *type);
+
+
+/**
+ * cma_alloc - allocates contiguous chunk of memory.
+ * @dev:	The device to perform allocation for.
+ * @type:	A type of memory to allocate.  Platform may define
+ *		several different types of memory and device drivers
+ *		can then request chunks of different types.  Usually it's
+ *		safe to pass NULL here which is the same as passing
+ *		"common".
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc(const struct device *dev, const char *type,
+	  size_t size, dma_addr_t alignment)
+{
+	return dev ? __cma_alloc(dev, type, size, alignment) : -EINVAL;
+}
+
+
+/**
+ * struct cma_info - information about regions returned by cma_info().
+ * @lower_bound:	The smallest address that is possible to be
+ *			allocated for given (dev, type) pair.
+ * @upper_bound:	The one byte after the biggest address that is
+ *			possible to be allocated for given (dev, type)
+ *			pair.
+ * @total_size:	Total size of regions mapped to (dev, type) pair.
+ * @free_size:	Total free size in all of the regions mapped to (dev, type)
+ *		pair.  Because of possible race conditions, it is not
+ *		guaranteed that the value will be correct -- it gives only
+ *		an approximation.
+ * @count:	Number of regions mapped to (dev, type) pair.
+ */
+struct cma_info {
+	dma_addr_t lower_bound, upper_bound;
+	size_t total_size, free_size;
+	unsigned count;
+};
+
+/**
+ * cma_info - queries information about regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @dev:	The device to query information for.
+ * @type:	A type of memory to query information for.
+ *		If unsure, pass NULL here which is equal to passing
+ *		"common".
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info(struct cma_info *info, const struct device *dev, const char *type)
+{
+	return dev ? __cma_info(info, dev, type) : -EINVAL;
+}
+
+
+/**
+ * cma_free - frees a chunk of memory.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns -ENOENT if there is no chunk at given location; otherwise
+ * zero.  In the former case issues a warning.
+ */
+int cma_free(dma_addr_t addr);
+
+
+
+/****************************** Lower lever API *****************************/
+
+/**
+ * cma_alloc_from - allocates contiguous chunk of memory from named regions.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ *		byte or a semicolon.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)
+{
+	return __cma_alloc(NULL, regions, size, alignment);
+}
+
+/**
+ * cma_info_about - queries information about named regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ *		byte or a semicolon.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+{
+	return __cma_info(info, NULL, regions);
+}
+
+
+
+struct cma_allocator;
+
+/**
+ * struct cma_region - a region reserved for CMA allocations.
+ * @name:	Unique name of the region.  Read only.
+ * @start:	Bus address of the region in bytes.  Always aligned at
+ *		least to a full page.  Read only.
+ * @size:	Size of the region in bytes.  Multiply of a page size.
+ *		Read only.
+ * @free_space:	Free space in the region.  Read only.
+ * @alignment:	Desired alignment of the region in bytes.  A power of two,
+ *		always at least page size.  Early.
+ * @alloc:	Allocator used with this region.  NULL means allocator is
+ *		not attached.  Private.
+ * @alloc_name:	Allocator name read from cmdline.  Private.  This may be
+ *		different from @alloc->name.
+ * @private_data:	Allocator's private data.
+ * @users:	Number of chunks allocated in this region.
+ * @list:	Entry in list of regions.  Private.
+ * @used:	Whether region was already used, ie. there was at least
+ *		one allocation request for it.  Private.
+ * @registered:	Whether this region has been registered.  Read only.
+ * @reserved:	Whether this region has been reserved.  Early.  Read only.
+ * @copy_name:	Whether @name and @alloc_name needs to be copied when
+ *		this region is converted from early to normal.  Early.
+ *		Private.
+ * @free_alloc_name:	Whether @alloc_name was kmalloced().  Private.
+ *
+ * Regions come in two types: an early region and normal region.  The
+ * former can be reserved or not-reserved.  Fields marked as "early"
+ * are only meaningful in early regions.
+ *
+ * Early regions are important only during initialisation.  The list
+ * of early regions is built from the "cma" command line argument or
+ * platform defaults.  Platform initialisation code is responsible for
+ * reserving space for unreserved regions that are placed on
+ * cma_early_regions list.
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+struct cma_region {
+	const char *name;
+	dma_addr_t start;
+	size_t size;
+	union {
+		size_t free_space;	/* Normal region */
+		dma_addr_t alignment;	/* Early region */
+	};
+
+	struct cma_allocator *alloc;
+	const char *alloc_name;
+	void *private_data;
+
+	unsigned users;
+	struct list_head list;
+
+	unsigned used:1;
+	unsigned registered:1;
+	unsigned reserved:1;
+	unsigned copy_name:1;
+	unsigned free_alloc_name:1;
+};
+
+
+/**
+ * cma_region_register() - registers a region.
+ * @reg:	Region to register.
+ *
+ * Region's start and size must be set.
+ *
+ * If name is set, the region will be accessible using normal mechanisms
+ * like the mapping or the cma_alloc_from() function; otherwise it will be
+ * a private region accessible only using the
+ * cma_alloc_from_region() function.
+ *
+ * If alloc is set, the function will try to initialise the given allocator
+ * (and will return an error if it fails).  Otherwise alloc_name may
+ * point to a name of an allocator to use (if not set, the default
+ * will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  In particular, -EADDRINUSE if
+ * the region overlaps with an already existing region.
+ */
+int __must_check cma_region_register(struct cma_region *reg);
+
+/**
+ * cma_region_unregister() - unregisters a region.
+ * @reg:	Region to unregister.
+ *
+ * The region is unregistered only if there are no chunks allocated for
+ * it.  Otherwise, the function returns -EBUSY.
+ *
+ * On success returns zero.
+ */
+int __must_check cma_region_unregister(struct cma_region *reg);
+
+
+/**
+ * cma_alloc_from_region() - allocates contiguous chunk of memory from region.
+ * @reg:	Region to allocate chunk from.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);
+
+
+
+/****************************** Allocators API ******************************/
+
+/**
+ * struct cma_chunk - an allocated contiguous chunk of memory.
+ * @start:	Bus address in bytes.
+ * @size:	Size in bytes.
+ * @free_space:	Free space in region in bytes.  Read only.
+ * @reg:	Region this chunk belongs to.
+ * @by_start:	A node in an red-black tree with all chunks sorted by
+ *		start address.
+ *
+ * The cma_allocator::alloc() operation needs to set only the @start
+ * and @size fields.  The rest is handled by the caller (ie. CMA
+ * glue).
+ */
+struct cma_chunk {
+	dma_addr_t start;
+	size_t size;
+
+	struct cma_region *reg;
+	struct rb_node by_start;
+};
+
+
+/**
+ * struct cma_allocator - a CMA allocator.
+ * @name:	Allocator's unique name
+ * @init:	Initialises an allocator on given region.
+ * @cleanup:	Cleans up after init.  May assume that there are no chunks
+ *		allocated in given region.
+ * @alloc:	Allocates a chunk of memory of given size in bytes and
+ *		with given alignment.  Alignment is a power of
+ *		two (thus non-zero) and callback does not need to check it.
+ *		May also assume that it is the only call that uses given
+ *		region (ie. access to the region is synchronised with
+ *		a mutex).  This has to allocate the chunk object (it may be
+ *		contained in a bigger structure with allocator-specific data).
+ *		Required.
+ * @free:	Frees allocated chunk.  May also assume that it is the only
+ *		call that uses given region.  This has to free() the chunk
+ *		object as well.  Required.
+ * @list:	Entry in list of allocators.  Private.
+ */
+struct cma_allocator {
+	const char *name;
+
+	int (*init)(struct cma_region *reg);
+	void (*cleanup)(struct cma_region *reg);
+	struct cma_chunk *(*alloc)(struct cma_region *reg, size_t size,
+				   dma_addr_t alignment);
+	void (*free)(struct cma_chunk *chunk);
+
+	struct list_head list;
+};
+
+
+/**
+ * cma_allocator_register() - Registers an allocator.
+ * @alloc:	Allocator to register.
+ *
+ * Adds allocator to the list of allocators managed by CMA.
+ *
+ * All of the fields of cma_allocator structure must be set except for
+ * the optional name and the list's head which will be overridden
+ * anyway.
+ *
+ * Returns zero or negative error code.
+ */
+int cma_allocator_register(struct cma_allocator *alloc);
+
+
+/**************************** Initialisation API ****************************/
+
+/**
+ * cma_set_defaults() - specifies default command line parameters.
+ * @regions:	A zero-sized entry terminated list of early regions.
+ *		This array must not be placed in __initdata section.
+ * @map:	Map attribute.
+ *
+ * This function should be called prior to cma_early_regions_reserve()
+ * and after early parameters have been parsed.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_set_defaults(struct cma_region *regions, const char *map);
+
+
+/**
+ * cma_early_regions - a list of early regions.
+ *
+ * The platform needs to allocate space for each of the regions before
+ * initcalls are executed.  If space is reserved, the reserved flag
+ * must be set.  Platform initialisation code may choose to use
+ * cma_early_regions_allocate().
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+extern struct list_head cma_early_regions __initdata;
+
+
+/**
+ * cma_early_region_register() - registers an early region.
+ * @reg:	Region to add.
+ *
+ * Region's size, start and alignment must be set (however the last
+ * two can be zero).  If name is set, the region will be accessible
+ * using normal mechanisms like mapping or the cma_alloc_from()
+ * function; otherwise it will be a private region, accessible only
+ * via cma_alloc_from_region().
+ *
+ * During platform initialisation, space is reserved for early
+ * regions.  Later, when CMA initialises, the early regions are
+ * "converted" into normal regions.  If cma_region::alloc is set, CMA
+ * will then try to set up the given allocator on the region.  Failure to
+ * do so will result in the region not being registered even though
+ * the space for it will still be reserved.  If cma_region::alloc is
+ * not set, allocator will be attached to the region on first use and
+ * the value of cma_region::alloc_name will be taken into account if
+ * set.
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  No check for overlapping regions
+ * is performed.
+ */
+int __init __must_check cma_early_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_early_region_reserve() - reserves a physically contiguous memory region.
+ * @reg:	Early region to reserve memory for.
+ *
+ * If the platform supports bootmem, this is the first allocator this
+ * function tries to use.  If that fails (or bootmem is not
+ * supported), the function tries to use memblock if it is available.
+ *
+ * On success sets reg->reserved flag.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_early_region_reserve(struct cma_region *reg);
+
+/**
+ * cma_early_regions_reserve() - helper function for reserving early regions.
+ * @reserve:	Callback function used to reserve space for a region.
+ *		Needs to return non-negative if reservation succeeded,
+ *		negative error otherwise.  NULL means
+ *		cma_early_region_reserve() will be used.
+ *
+ * This function traverses the %cma_early_regions list and tries to
+ * reserve memory for each early region.  It uses the @reserve
+ * callback function for that purpose.  The reserved flag of each
+ * region is updated accordingly.
+ */
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+#else
+
+#define cma_set_defaults(regions, map)     ((int)0)
+#define cma_early_region_reserve(region)   ((int)-EOPNOTSUPP)
+#define cma_early_regions_reserve(reserve) do { } while (0)
+
+#endif
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index f4e516e..86043a3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -301,3 +301,44 @@ config NOMMU_INITIAL_TRIM_EXCESS
 	  of 1 says that all excess pages should be trimmed.
 
 	  See Documentation/nommu-mmap.txt for more information.
+
+
+config CMA
+	bool "Contiguous Memory Allocator framework"
+	# Currently there is only one allocator so force it on
+	select CMA_BEST_FIT
+	help
+	  This enables the Contiguous Memory Allocator framework which
+	  allows drivers to allocate big physically-contiguous blocks of
+	  memory for use with hardware components that support neither
+	  I/O mapping nor scatter-gather.
+
+	  If you select this option you will also have to select at least
+	  one allocator algorithm below.
+
+	  To make use of CMA you need to specify the regions and
+	  driver->region mapping on command line when booting the kernel.
+
+	  For more information see <Documentation/contiguous-memory.txt>.
+	  If unsure, say "n".
+
+config CMA_DEBUG
+	bool "CMA debug messages (DEVELOPMENT)"
+	depends on CMA
+	help
+	  Turns on debug messages in CMA.  This produces KERN_DEBUG
+	  messages for every CMA call as well as various messages while
+	  processing calls such as cma_alloc().  This option does not
+	  affect warning and error messages.
+
+	  This is mostly used during development.  If unsure, say "n".
+
+config CMA_BEST_FIT
+	bool "CMA best-fit allocator"
+	depends on CMA
+	help
+	  This is a best-fit algorithm running in O(n log n) time where
+	  n is the number of existing holes (which is never greater than
+	  the number of allocated regions and usually much smaller).  It
+	  allocates an area from the smallest hole that is big enough for
+	  the allocation in question.
diff --git a/mm/Makefile b/mm/Makefile
index 34b2546..d8c717f 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -47,3 +47,5 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
diff --git a/mm/cma-best-fit.c b/mm/cma-best-fit.c
new file mode 100644
index 0000000..97f8d61
--- /dev/null
+++ b/mm/cma-best-fit.c
@@ -0,0 +1,407 @@
+/*
+ * Contiguous Memory Allocator framework: Best Fit allocator
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "cma: bf: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/slab.h>        /* kmalloc() */
+
+#include <linux/cma.h>         /* CMA structures */
+
+
+/************************* Data Types *************************/
+
+struct cma_bf_item {
+	struct cma_chunk ch;
+	struct rb_node by_size;
+};
+
+struct cma_bf_private {
+	struct rb_root by_start_root;
+	struct rb_root by_size_root;
+};
+
+
+/************************* Prototypes *************************/
+
+/*
+ * These are only for holes.  They must be called whenever a hole's
+ * properties change, but also whenever a chunk becomes a hole or
+ * a hole becomes a chunk.
+ */
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item);
+static int  __must_check
+__cma_bf_hole_insert_by_start(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item);
+
+/**
+ * __cma_bf_hole_take - takes a chunk of memory out of a hole.
+ * @hole:	hole to take chunk from
+ * @size:	chunk's size
+ * @alignment:	chunk's starting address alignment (must be power of two)
+ *
+ * Takes a @size bytes large chunk from hole @hole which must be able
+ * to hold the chunk.  "Must be able" includes the alignment
+ * constraint as well.
+ *
+ * Returns allocated item or NULL on error (if kmalloc() failed).
+ */
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment);
+
+/**
+ * __cma_bf_hole_merge_maybe - tries to merge hole with neighbours.
+ * @item: hole to try and merge
+ *
+ * Which items are preserved is undefined so you may not rely on it.
+ */
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item);
+
+
+/************************* Device API *************************/
+
+int cma_bf_init(struct cma_region *reg)
+{
+	struct cma_bf_private *prv;
+	struct cma_bf_item *item;
+
+	prv = kzalloc(sizeof *prv, GFP_KERNEL);
+	if (unlikely(!prv))
+		return -ENOMEM;
+
+	item = kzalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item)) {
+		kfree(prv);
+		return -ENOMEM;
+	}
+
+	item->ch.start = reg->start;
+	item->ch.size  = reg->size;
+	item->ch.reg   = reg;
+
+	rb_root_init(&prv->by_start_root, &item->ch.by_start);
+	rb_root_init(&prv->by_size_root, &item->by_size);
+
+	reg->private_data = prv;
+	return 0;
+}
+
+void cma_bf_cleanup(struct cma_region *reg)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct cma_bf_item *item =
+		rb_entry(prv->by_size_root.rb_node,
+			 struct cma_bf_item, by_size);
+
+	/* We can assume there is only a single hole in the tree. */
+	WARN_ON(item->by_size.rb_left || item->by_size.rb_right ||
+		item->ch.by_start.rb_left || item->ch.by_start.rb_right);
+
+	kfree(item);
+	kfree(prv);
+}
+
+struct cma_chunk *cma_bf_alloc(struct cma_region *reg,
+			       size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct rb_node *node = prv->by_size_root.rb_node;
+	struct cma_bf_item *item = NULL;
+
+	/* First find hole that is large enough */
+	while (node) {
+		struct cma_bf_item *i =
+			rb_entry(node, struct cma_bf_item, by_size);
+
+		if (i->ch.size < size) {
+			node = node->rb_right;
+		} else if (i->ch.size >= size) {
+			node = node->rb_left;
+			item = i;
+		}
+	}
+	if (!item)
+		return NULL;
+
+	/* Now look for items which can satisfy alignment requirements */
+	for (;;) {
+		dma_addr_t start = ALIGN(item->ch.start, alignment);
+		dma_addr_t end   = item->ch.start + item->ch.size;
+		if (start < end && end - start >= size) {
+			item = __cma_bf_hole_take(item, size, alignment);
+			return likely(item) ? &item->ch : NULL;
+		}
+
+		/* node is NULL after the initial search; advance from item */
+		node = rb_next(&item->by_size);
+		if (!node)
+			return NULL;
+
+		item  = rb_entry(node, struct cma_bf_item, by_size);
+	}
+}
+
+void cma_bf_free(struct cma_chunk *chunk)
+{
+	struct cma_bf_item *item = container_of(chunk, struct cma_bf_item, ch);
+
+	/* Add new hole */
+	if (unlikely(__cma_bf_hole_insert_by_start(item))) {
+		/*
+		 * We're screwed...  Just free the item and forget
+		 * about it.  Things are broken beyond repair so no
+		 * sense in trying to recover.
+		 */
+		kfree(item);
+	} else {
+		__cma_bf_hole_insert_by_size(item);
+
+		/* Merge with prev and next sibling */
+		__cma_bf_hole_merge_maybe(item);
+	}
+}
+
+
+/************************* Basic Tree Manipulation *************************/
+
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_size_root.rb_node, *parent = NULL;
+	const typeof(item->ch.size) value = item->ch.size;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, by_size);
+		link = value <= i->ch.size
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->by_size, parent, link);
+	rb_insert_color(&item->by_size, &prv->by_size_root);
+}
+
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->by_size, &prv->by_size_root);
+}
+
+static int  __must_check
+__cma_bf_hole_insert_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_start_root.rb_node, *parent = NULL;
+	const typeof(item->ch.start) value = item->ch.start;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, ch.by_start);
+
+		if (WARN_ON(value == i->ch.start))
+			/*
+			 * This should *never* happen.  And I mean
+			 * *never*.  We could even BUG on it but
+			 * hopefully things are only a bit broken,
+			 * ie. system can still run.  We produce
+			 * a warning and return an error.
+			 */
+			return -EBUSY;
+
+		link = value <= i->ch.start
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->ch.by_start, parent, link);
+	rb_insert_color(&item->ch.by_start, &prv->by_start_root);
+	return 0;
+}
+
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->ch.by_start, &prv->by_start_root);
+}
+
+
+/************************* More Tree Manipulation *************************/
+
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_item *item;
+
+	/*
+	 * There are three cases:
+	 * 1. the chunk takes the whole hole,
+	 * 2. the chunk is at the beginning or at the end of the hole, or
+	 * 3. the chunk is in the middle of the hole.
+	 */
+
+
+	/* Case 1, the whole hole */
+	if (size == hole->ch.size) {
+		__cma_bf_hole_erase_by_size(hole);
+		__cma_bf_hole_erase_by_start(hole);
+		return hole;
+	}
+
+
+	/* Allocate */
+	item = kmalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item))
+		return NULL;
+
+	item->ch.start = ALIGN(hole->ch.start, alignment);
+	item->ch.size  = size;
+
+	/* Case 3, in the middle */
+	if (item->ch.start != hole->ch.start
+	 && item->ch.start + item->ch.size !=
+	    hole->ch.start + hole->ch.size) {
+		struct cma_bf_item *tail;
+
+		/*
+		 * Space between the end of the chunk and the end of
+		 * the hole, ie. space left after the end of the
+		 * chunk.  If this is divisible by alignment we can
+		 * move the chunk to the end of the hole.
+		 */
+		size_t left =
+			hole->ch.start + hole->ch.size -
+			(item->ch.start + item->ch.size);
+		if (left % alignment == 0) {
+			item->ch.start += left;
+			goto case_2;
+		}
+
+		/*
+		 * We are going to add a hole at the end.  This way,
+		 * we will reduce the problem to case 2 -- the chunk
+		 * will be at the end of the hole.
+		 */
+		tail = kmalloc(sizeof *tail, GFP_KERNEL);
+		if (unlikely(!tail)) {
+			kfree(item);
+			return NULL;
+		}
+
+		tail->ch.start = item->ch.start + item->ch.size;
+		tail->ch.size  =
+			hole->ch.start + hole->ch.size - tail->ch.start;
+		tail->ch.reg   = hole->ch.reg;
+
+		if (unlikely(__cma_bf_hole_insert_by_start(tail))) {
+			/*
+			 * Things are broken beyond repair...  Abort
+			 * inserting the hole but still continue with
+			 * allocation (seems like the best we can do).
+			 */
+
+			hole->ch.size = tail->ch.start - hole->ch.start;
+			kfree(tail);
+		} else {
+			__cma_bf_hole_insert_by_size(tail);
+			/*
+			 * It's important that we first insert the new
+			 * hole in the tree sorted by size and later
+			 * reduce the size of the old hole.  We will
+			 * update the position of the old hole in the
+			 * rb tree in code that handles case 2.
+			 */
+			hole->ch.size = tail->ch.start - hole->ch.start;
+		}
+
+		/* Go to case 2 */
+	}
+
+
+	/* Case 2, at the beginning or at the end */
+case_2:
+	/* No need to update the tree; order preserved. */
+	if (item->ch.start == hole->ch.start)
+		hole->ch.start += item->ch.size;
+
+	/* Alter hole's size */
+	hole->ch.size -= size;
+	__cma_bf_hole_erase_by_size(hole);
+	__cma_bf_hole_insert_by_size(hole);
+
+	return item;
+}
+
+
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item)
+{
+	struct cma_bf_item *prev;
+	struct rb_node *node;
+	int twice = 2;
+
+	node = rb_prev(&item->ch.by_start);
+	if (unlikely(!node))
+		goto next;
+	prev = rb_entry(node, struct cma_bf_item, ch.by_start);
+
+	for (;;) {
+		if (prev->ch.start + prev->ch.size == item->ch.start) {
+			/* Remove previous hole from trees */
+			__cma_bf_hole_erase_by_size(prev);
+			__cma_bf_hole_erase_by_start(prev);
+
+			/* Alter this hole */
+			item->ch.size += prev->ch.size;
+			item->ch.start = prev->ch.start;
+			__cma_bf_hole_erase_by_size(item);
+			__cma_bf_hole_insert_by_size(item);
+			/*
+			 * No need to update by start trees as we do
+			 * not break sequence order
+			 */
+
+			/* Free prev hole */
+			kfree(prev);
+		}
+
+next:
+		if (!--twice)
+			break;
+
+		node = rb_next(&item->ch.by_start);
+		if (unlikely(!node))
+			break;
+		prev = item;
+		item = rb_entry(node, struct cma_bf_item, ch.by_start);
+	}
+}
+
+
+
+/************************* Register *************************/
+static int cma_bf_module_init(void)
+{
+	static struct cma_allocator alloc = {
+		.name    = "bf",
+		.init    = cma_bf_init,
+		.cleanup = cma_bf_cleanup,
+		.alloc   = cma_bf_alloc,
+		.free    = cma_bf_free,
+	};
+	return cma_allocator_register(&alloc);
+}
+module_init(cma_bf_module_init);
diff --git a/mm/cma.c b/mm/cma.c
new file mode 100644
index 0000000..06d0d5a
--- /dev/null
+++ b/mm/cma.c
@@ -0,0 +1,911 @@
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#ifndef CONFIG_NO_BOOTMEM
+#  include <linux/bootmem.h>   /* alloc_bootmem_pages_nopanic() */
+#endif
+#ifdef CONFIG_HAVE_MEMBLOCK
+#  include <linux/memblock.h>  /* memblock*() */
+#endif
+#include <linux/device.h>      /* struct device, dev_name() */
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR, PTR_ERR, etc. */
+#include <linux/mm.h>          /* PAGE_ALIGN() */
+#include <linux/module.h>      /* EXPORT_SYMBOL_GPL() */
+#include <linux/mutex.h>       /* mutex */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/string.h>      /* str*() */
+
+#include <linux/cma.h>
+
+
+/*
+ * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
+ * cma_chunks_by_start.
+ */
+static DEFINE_MUTEX(cma_mutex);
+
+
+
+/************************* Map attribute *************************/
+
+static const char *cma_map;
+static size_t cma_map_length;
+
+/*
+ * map-attr      ::= [ rules [ ';' ] ]
+ * rules         ::= rule [ ';' rules ]
+ * rule          ::= patterns '=' regions
+ * patterns      ::= pattern [ ',' patterns ]
+ * regions       ::= REG-NAME [ ',' regions ]
+ * pattern       ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ */
+static ssize_t cma_map_validate(const char *param)
+{
+	const char *ch = param;
+
+	if (*ch == '\0' || *ch == '\n')
+		return 0;
+
+	for (;;) {
+		const char *start = ch;
+
+		while (*ch && *ch != '\n' && *ch != ';' && *ch != '=')
+			++ch;
+
+		if (*ch != '=' || start == ch) {
+			pr_err("map: expecting \"<patterns>=<regions>\" near %s\n",
+			       start);
+			return -EINVAL;
+		}
+
+		while (*++ch != ';')
+			if (*ch == '\0' || *ch == '\n')
+				return ch - param;
+		if (ch[1] == '\0' || ch[1] == '\n')
+			return ch - param;
+		++ch;
+	}
+}
+
+static int __init cma_map_param(char *param)
+{
+	ssize_t len;
+
+	pr_debug("param: map: %s\n", param);
+
+	len = cma_map_validate(param);
+	if (len < 0)
+		return len;
+
+	cma_map = param;
+	cma_map_length = len;
+	return 0;
+}
+
+
+
+/************************* Early regions *************************/
+
+struct list_head cma_early_regions __initdata =
+	LIST_HEAD_INIT(cma_early_regions);
+
+
+int __init __must_check cma_early_region_register(struct cma_region *reg)
+{
+	dma_addr_t start, alignment;
+	size_t size;
+
+	if (reg->alignment & (reg->alignment - 1))
+		return -EINVAL;
+
+	alignment = max(reg->alignment, (dma_addr_t)PAGE_SIZE);
+	start     = ALIGN(reg->start, alignment);
+	size      = PAGE_ALIGN(reg->size);
+
+	if (start + size < start)
+		return -EINVAL;
+
+	reg->size      = size;
+	reg->start     = start;
+	reg->alignment = alignment;
+
+	list_add_tail(&reg->list, &cma_early_regions);
+
+	pr_debug("param: registering early region %s (%p@%p/%p)\n",
+		 reg->name, (void *)reg->size, (void *)reg->start,
+		 (void *)reg->alignment);
+
+	return 0;
+}
+
+
+
+/************************* Regions & Allocators *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg);
+
+/* List of all regions.  Named regions are kept before unnamed. */
+static LIST_HEAD(cma_regions);
+
+#define cma_foreach_region(reg) \
+	list_for_each_entry(reg, &cma_regions, list)
+
+int __must_check cma_region_register(struct cma_region *reg)
+{
+	const char *name, *alloc_name;
+	struct cma_region *r;
+	char *ch = NULL;
+	int ret = 0;
+
+	if (!reg->size || reg->start + reg->size < reg->start)
+		return -EINVAL;
+
+	reg->users = 0;
+	reg->used = 0;
+	reg->private_data = NULL;
+	reg->registered = 0;
+	reg->free_space = reg->size;
+
+	/* Copy name and alloc_name */
+	name = reg->name;
+	alloc_name = reg->alloc_name;
+	if (reg->copy_name && (reg->name || reg->alloc_name)) {
+		size_t name_size, alloc_size;
+
+		name_size  = reg->name       ? strlen(reg->name) + 1       : 0;
+		alloc_size = reg->alloc_name ? strlen(reg->alloc_name) + 1 : 0;
+
+		ch = kmalloc(name_size + alloc_size, GFP_KERNEL);
+		if (!ch) {
+			pr_err("%s: not enough memory to allocate name\n",
+			       reg->name ?: "(private)");
+			return -ENOMEM;
+		}
+
+		if (name_size) {
+			memcpy(ch, reg->name, name_size);
+			name = ch;
+			ch += name_size;
+		}
+
+		if (alloc_size) {
+			memcpy(ch, reg->alloc_name, alloc_size);
+			alloc_name = ch;
+		}
+	}
+
+	mutex_lock(&cma_mutex);
+
+	/* Don't let regions overlap */
+	cma_foreach_region(r)
+		if (r->start + r->size > reg->start &&
+		    r->start < reg->start + reg->size) {
+			ret = -EADDRINUSE;
+			goto done;
+		}
+
+	if (reg->alloc) {
+		ret = __cma_region_attach_alloc(reg);
+		if (unlikely(ret < 0))
+			goto done;
+	}
+
+	reg->name = name;
+	reg->alloc_name = alloc_name;
+	reg->registered = 1;
+	ch = NULL;
+
+	/*
+	 * Keep named at the beginning and unnamed (private) at the
+	 * end.  This helps in traversal when named region is looked
+	 * for.
+	 */
+	if (name)
+		list_add(&reg->list, &cma_regions);
+	else
+		list_add_tail(&reg->list, &cma_regions);
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: region %sregistered\n",
+		 reg->name ?: "(private)", ret ? "not " : "");
+	kfree(ch);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_region_register);
+
+static struct cma_region *__must_check
+__cma_region_find(const char **namep)
+{
+	struct cma_region *reg;
+	const char *ch, *name;
+	size_t n;
+
+	ch = *namep;
+	while (*ch && *ch != ',' && *ch != ';')
+		++ch;
+	name = *namep;
+	*namep = *ch == ',' ? ch + 1 : ch;
+	n = ch - name;
+
+	/*
+	 * Named regions are kept in front of unnamed so if we
+	 * encounter unnamed region we can stop.
+	 */
+	cma_foreach_region(reg)
+		if (!reg->name)
+			break;
+		else if (!strncmp(name, reg->name, n) && !reg->name[n])
+			return reg;
+
+	return NULL;
+}
+
+
+/* List of all allocators. */
+static LIST_HEAD(cma_allocators);
+
+#define cma_foreach_allocator(alloc) \
+	list_for_each_entry(alloc, &cma_allocators, list)
+
+int cma_allocator_register(struct cma_allocator *alloc)
+{
+	struct cma_region *reg;
+	int first;
+
+	if (!alloc->alloc || !alloc->free)
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	first = list_empty(&cma_allocators);
+
+	list_add_tail(&alloc->list, &cma_allocators);
+
+	/*
+	 * Attach this allocator to all allocator-less regions that
+	 * request this particular allocator (reg->alloc_name equals
+	 * alloc->name) or if region wants the first available
+	 * allocator and we are the first.
+	 */
+	cma_foreach_region(reg) {
+		if (reg->alloc)
+			continue;
+		if (!(reg->alloc_name
+		    ? alloc->name && !strcmp(alloc->name, reg->alloc_name)
+		    : (!reg->used && first)))
+			continue;
+
+		reg->alloc = alloc;
+		__cma_region_attach_alloc(reg);
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: allocator registered\n", alloc->name ?: "(unnamed)");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cma_allocator_register);
+
+static struct cma_allocator *__must_check
+__cma_allocator_find(const char *name)
+{
+	struct cma_allocator *alloc;
+
+	if (!name)
+		return list_empty(&cma_allocators)
+			? NULL
+			: list_entry(cma_allocators.next,
+				     struct cma_allocator, list);
+
+	cma_foreach_allocator(alloc)
+		if (alloc->name && !strcmp(name, alloc->name))
+			return alloc;
+
+	return NULL;
+}
+
+
+
+/************************* Initialise CMA *************************/
+
+int __init cma_set_defaults(struct cma_region *regions, const char *map)
+{
+	if (map) {
+		int ret = cma_map_param((char *)map);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	if (!regions)
+		return 0;
+
+	for (; regions->size; ++regions) {
+		int ret = cma_early_region_register(regions);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	return 0;
+}
+
+
+int __init cma_early_region_reserve(struct cma_region *reg)
+{
+	int tried = 0;
+
+	if (!reg->size || (reg->alignment & (reg->alignment - 1)) ||
+	    reg->reserved)
+		return -EINVAL;
+
+#ifndef CONFIG_NO_BOOTMEM
+
+	tried = 1;
+
+	{
+		void *ptr = __alloc_bootmem_nopanic(reg->size, reg->alignment,
+						    reg->start);
+		if (ptr) {
+			reg->start = virt_to_phys(ptr);
+			reg->reserved = 1;
+			return 0;
+		}
+	}
+
+#endif
+
+#ifdef CONFIG_HAVE_MEMBLOCK
+
+	tried = 1;
+
+	if (reg->start) {
+		if (!memblock_is_region_reserved(reg->start, reg->size) &&
+		    memblock_reserve(reg->start, reg->size) >= 0) {
+			reg->reserved = 1;
+			return 0;
+		}
+	} else {
+		/*
+		 * Use __memblock_alloc_base() since
+		 * memblock_alloc_base() panic()s.
+		 */
+		u64 ret = __memblock_alloc_base(reg->size, reg->alignment, 0);
+		if (ret &&
+		    ret < ~(dma_addr_t)0 &&
+		    ret + reg->size < ~(dma_addr_t)0 &&
+		    ret + reg->size > ret) {
+			reg->start = ret;
+			reg->reserved = 1;
+			return 0;
+		}
+
+		if (ret)
+			memblock_free(ret, reg->size);
+	}
+
+#endif
+
+	return tried ? -ENOMEM : -EOPNOTSUPP;
+}
+
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg))
+{
+	struct cma_region *reg;
+
+	pr_debug("init: reserving early regions\n");
+
+	if (!reserve)
+		reserve = cma_early_region_reserve;
+
+	list_for_each_entry(reg, &cma_early_regions, list) {
+		if (reg->reserved) {
+			/* nothing */
+		} else if (reserve(reg) >= 0) {
+			pr_debug("init: %s: reserved %p@%p\n",
+				 reg->name ?: "(private)",
+				 (void *)reg->size, (void *)reg->start);
+			reg->reserved = 1;
+		} else {
+			pr_warn("init: %s: unable to reserve %p@%p/%p\n",
+				reg->name ?: "(private)",
+				(void *)reg->size, (void *)reg->start,
+				(void *)reg->alignment);
+		}
+	}
+}
+
+
+static int __init cma_init(void)
+{
+	struct cma_region *reg, *n;
+
+	pr_debug("init: initialising\n");
+
+	if (cma_map) {
+		char *val = kmemdup(cma_map, cma_map_length + 1, GFP_KERNEL);
+		cma_map = val;
+		if (!val)
+			return -ENOMEM;
+		val[cma_map_length] = '\0';
+	}
+
+	list_for_each_entry_safe(reg, n, &cma_early_regions, list) {
+		INIT_LIST_HEAD(&reg->list);
+		/*
+		 * We don't care if there was an error.  It's a pity
+		 * but there's not much we can do about it any way.
+		 * If the error is on a region that was parsed from
+		 * command line then it will stay and waste a bit of
+		 * space; if it was registered using
+		 * cma_early_region_register() it's caller's
+		 * responsibility to do something about it.
+		 */
+		if (reg->reserved && cma_region_register(reg) < 0)
+			/* ignore error */;
+	}
+
+	INIT_LIST_HEAD(&cma_early_regions);
+
+	return 0;
+}
+/*
+ * We want to be initialised earlier than module_init/__initcall so
+ * that drivers that want to grab memory at boot time will get CMA
+ * ready.  subsys_initcall() seems early enough and not too early at
+ * the same time.
+ */
+subsys_initcall(cma_init);
+
+
+
+/************************* Chunks *************************/
+
+/* All chunks sorted by start address. */
+static struct rb_root cma_chunks_by_start;
+
+static struct cma_chunk *__must_check __cma_chunk_find(dma_addr_t addr)
+{
+	struct cma_chunk *chunk;
+	struct rb_node *n;
+
+	for (n = cma_chunks_by_start.rb_node; n; ) {
+		chunk = rb_entry(n, struct cma_chunk, by_start);
+		if (addr < chunk->start)
+			n = n->rb_left;
+		else if (addr > chunk->start)
+			n = n->rb_right;
+		else
+			return chunk;
+	}
+	WARN(1, "no chunk starting at %p\n", (void *)addr);
+	return NULL;
+}
+
+static int __must_check __cma_chunk_insert(struct cma_chunk *chunk)
+{
+	struct rb_node **new, *parent = NULL;
+	typeof(chunk->start) addr = chunk->start;
+
+	for (new = &cma_chunks_by_start.rb_node; *new; ) {
+		struct cma_chunk *c =
+			container_of(*new, struct cma_chunk, by_start);
+
+		parent = *new;
+		if (addr < c->start) {
+			new = &(*new)->rb_left;
+		} else if (addr > c->start) {
+			new = &(*new)->rb_right;
+		} else {
+			/*
+			 * We should never be here.  If we are it
+			 * means allocator gave us an invalid chunk
+			 * (one that has already been allocated) so we
+			 * refuse to accept it.  Our caller will
+			 * recover by freeing the chunk.
+			 */
+			WARN_ON(1);
+			return -EADDRINUSE;
+		}
+	}
+
+	rb_link_node(&chunk->by_start, parent, new);
+	rb_insert_color(&chunk->by_start, &cma_chunks_by_start);
+
+	return 0;
+}
+
+static void __cma_chunk_free(struct cma_chunk *chunk)
+{
+	rb_erase(&chunk->by_start, &cma_chunks_by_start);
+
+	chunk->reg->alloc->free(chunk);
+	--chunk->reg->users;
+	chunk->reg->free_space += chunk->size;
+}
+
+
+/************************* The Device API *************************/
+
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *type);
+
+
+/* Allocate. */
+
+static dma_addr_t __must_check
+__cma_alloc_from_region(struct cma_region *reg,
+			size_t size, dma_addr_t alignment)
+{
+	struct cma_chunk *chunk;
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!reg || reg->free_space < size)
+		return -ENOMEM;
+
+	if (!reg->alloc) {
+		if (!reg->used)
+			__cma_region_attach_alloc(reg);
+		if (!reg->alloc)
+			return -ENOMEM;
+	}
+
+	chunk = reg->alloc->alloc(reg, size, alignment);
+	if (!chunk)
+		return -ENOMEM;
+
+	if (unlikely(__cma_chunk_insert(chunk) < 0)) {
+		/* We should *never* be here. */
+		reg->alloc->free(chunk);
+		kfree(chunk);
+		return -EADDRINUSE;
+	}
+
+	chunk->reg = reg;
+	++reg->users;
+	reg->free_space -= chunk->size;
+	pr_debug("allocated at %p\n", (void *)chunk->start);
+	return chunk->start;
+}
+
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment)
+{
+	dma_addr_t addr;
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!size || alignment & (alignment - 1) || !reg)
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	addr = reg->registered ?
+		__cma_alloc_from_region(reg, PAGE_ALIGN(size),
+					max(alignment, (dma_addr_t)PAGE_SIZE)) :
+		-EINVAL;
+
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(cma_alloc_from_region);
+
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *type,
+	    dma_addr_t size, dma_addr_t alignment)
+{
+	struct cma_region *reg;
+	const char *from;
+	dma_addr_t addr;
+
+	if (dev)
+		pr_debug("allocate %p/%p for %s/%s\n",
+			 (void *)size, (void *)alignment,
+			 dev_name(dev), type ?: "");
+
+	if (!size || alignment & (alignment - 1))
+		return -EINVAL;
+
+	size = PAGE_ALIGN(size);
+	if (alignment < PAGE_SIZE)
+		alignment = PAGE_SIZE;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, type);
+	if (unlikely(IS_ERR(from))) {
+		addr = PTR_ERR(from);
+		goto done;
+	}
+
+	pr_debug("allocate %p/%p from one of %s\n",
+		 (void *)size, (void *)alignment, from);
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		addr = __cma_alloc_from_region(reg, size, alignment);
+		if (!IS_ERR_VALUE(addr))
+			goto done;
+	}
+
+	pr_debug("not enough memory\n");
+	addr = -ENOMEM;
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(__cma_alloc);
+
+
+/* Query information about regions. */
+static void __cma_info_add(struct cma_info *infop, struct cma_region *reg)
+{
+	infop->total_size += reg->size;
+	infop->free_size += reg->free_space;
+	if (infop->lower_bound > reg->start)
+		infop->lower_bound = reg->start;
+	if (infop->upper_bound < reg->start + reg->size)
+		infop->upper_bound = reg->start + reg->size;
+	++infop->count;
+}
+
+int
+__cma_info(struct cma_info *infop, const struct device *dev, const char *type)
+{
+	struct cma_info info = { ~(dma_addr_t)0, 0, 0, 0, 0 };
+	struct cma_region *reg;
+	const char *from;
+	int ret;
+
+	if (unlikely(!infop))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, type);
+	if (IS_ERR(from)) {
+		ret = PTR_ERR(from);
+		info.lower_bound = 0;
+		goto done;
+	}
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		if (reg)
+			__cma_info_add(&info, reg);
+	}
+
+	ret = 0;
+done:
+	mutex_unlock(&cma_mutex);
+
+	memcpy(infop, &info, sizeof info);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__cma_info);
+
+
+/* Freeing. */
+int cma_free(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+	int ret;
+
+	mutex_lock(&cma_mutex);
+
+	c = __cma_chunk_find(addr);
+
+	if (c) {
+		__cma_chunk_free(c);
+		ret = 0;
+	} else {
+		ret = -ENOENT;
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	if (c)
+		pr_debug("free(%p): freed\n", (void *)addr);
+	else
+		pr_err("free(%p): not found\n", (void *)addr);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_free);
+
+
+/************************* Miscellaneous *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg)
+{
+	struct cma_allocator *alloc;
+	int ret;
+
+	/*
+	 * If reg->alloc is set then caller wants us to use this
+	 * allocator.  Otherwise we need to find one by name.
+	 */
+	if (reg->alloc) {
+		alloc = reg->alloc;
+	} else {
+		alloc = __cma_allocator_find(reg->alloc_name);
+		if (!alloc) {
+			pr_warn("init: %s: %s: no such allocator\n",
+				reg->name ?: "(private)",
+				reg->alloc_name ?: "(default)");
+			reg->used = 1;
+			return -ENOENT;
+		}
+	}
+
+	/* Try to initialise the allocator. */
+	reg->private_data = NULL;
+	ret = alloc->init ? alloc->init(reg) : 0;
+	if (unlikely(ret < 0)) {
+		pr_err("init: %s: %s: unable to initialise allocator\n",
+		       reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+		reg->alloc = NULL;
+		reg->used = 1;
+	} else {
+		reg->alloc = alloc;
+		pr_debug("init: %s: %s: initialised allocator\n",
+			 reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+	}
+	return ret;
+}
+
+
+/*
+ * s            ::= rules
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' regions
+ * patterns     ::= pattern [ ',' patterns ]
+ * regions      ::= REG-NAME [ ',' regions ]
+ * pattern      ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+ */
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *type)
+{
+	/*
+	 * This function matches the patterns from the map attribute
+	 * against the given device name and type.  Type may of course
+	 * be NULL or an empty string.
+	 */
+
+	const char *s, *name;
+	int name_matched = 0;
+
+	/*
+	 * If dev is NULL we were called in alternative form where
+	 * type is the from string.  All we have to do is return it.
+	 */
+	if (!dev)
+		return type ?: ERR_PTR(-EINVAL);
+
+	if (!cma_map)
+		return ERR_PTR(-ENOENT);
+
+	name = dev_name(dev);
+	if (WARN_ON(!name || !*name))
+		return ERR_PTR(-EINVAL);
+
+	if (!type)
+		type = "common";
+
+	/*
+	 * Now we go through the cma_map attribute.
+	 */
+	for (s = cma_map; *s; ++s) {
+		const char *c;
+
+		/*
+		 * If the pattern starts with a slash, the device part of the
+		 * pattern matches if it matched previously.
+		 */
+		if (*s == '/') {
+			if (!name_matched)
+				goto look_for_next;
+			goto match_type;
+		}
+
+		/*
+		 * We are now trying to match the device name.  This also
+		 * updates the name_matched variable.  If, while reading the
+		 * spec, we encounter a comma, it means that the pattern does
+		 * not match and we need to start over with another pattern
+		 * (the one after the comma).  If we encounter an equal sign,
+		 * we need to start over with another rule.  If there is
+		 * a character that does not match, we need to look for
+		 * a comma (to get another pattern) or a semicolon (to get
+		 * another rule) and try
+		 * again if there is one somewhere.
+		 */
+
+		name_matched = 0;
+
+		for (c = name; *s != '*' && *c; ++c, ++s)
+			if (*s == '=')
+				goto next_rule;
+			else if (*s == ',')
+				goto next_pattern;
+			else if (*s != '?' && *c != *s)
+				goto look_for_next;
+		if (*s == '*')
+			++s;
+
+		name_matched = 1;
+
+		/*
+		 * Now we need to match the type part of the pattern.  If the
+		 * pattern is missing it, we match only if type points to an
+		 * empty string.  Otherwise we try to match it just like the
+		 * name.
+		 */
+		if (*s == '/') {
+match_type:		/* s points to '/' */
+			++s;
+
+			for (c = type; *s && *c; ++c, ++s)
+				if (*s == '=')
+					goto next_rule;
+				else if (*s == ',')
+					goto next_pattern;
+				else if (*c != *s)
+					goto look_for_next;
+		}
+
+		/* Return the string behind the '=' sign of the rule. */
+		if (*s == '=')
+			return s + 1;
+		else if (*s == ',')
+			return strchr(s, '=') + 1;
+
+		/* Pattern did not match */
+
+look_for_next:
+		do {
+			++s;
+		} while (*s != ',' && *s != '=');
+		if (*s == ',')
+			continue;
+
+next_rule:	/* s points to '=' */
+		s = strchr(s, ';');
+		if (!s)
+			break;
+
+next_pattern:
+		continue;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 2/9] mm: cma: Contiguous Memory Allocator added
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

The Contiguous Memory Allocator framework is a set of APIs for
allocating physically contiguous chunks of memory.

Various chips require contiguous blocks of memory to operate.  Those
chips include devices such as cameras, hardware video decoders and
encoders, etc.

The code is highly modular and customisable to suit the needs of
various users.  Set of regions reserved for CMA can be configured
per-platform and it is easy to add custom allocator algorithms if one
has such need.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
---
 Documentation/00-INDEX              |    2 +
 Documentation/contiguous-memory.txt |  544 +++++++++++++++++++++
 include/linux/cma.h                 |  434 +++++++++++++++++
 mm/Kconfig                          |   41 ++
 mm/Makefile                         |    2 +
 mm/cma-best-fit.c                   |  407 ++++++++++++++++
 mm/cma.c                            |  911 +++++++++++++++++++++++++++++++++++
 7 files changed, 2341 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 8dfc670..f93e787 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -94,6 +94,8 @@ connector/
 	- docs on the netlink based userspace<->kernel space communication mod.
 console/
 	- documentation on Linux console drivers.
+contiguous-memory.txt
+	- documentation on physically-contiguous memory allocation framework.
 cpu-freq/
 	- info on CPU frequency and voltage scaling.
 cpu-hotplug.txt
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
new file mode 100644
index 0000000..e470c6f
--- /dev/null
+++ b/Documentation/contiguous-memory.txt
@@ -0,0 +1,544 @@
+                                                             -*- org -*-
+
+* Contiguous Memory Allocator
+
+   The Contiguous Memory Allocator (CMA) is a framework, which allows
+   setting up a machine-specific configuration for physically-contiguous
+   memory management. Memory for devices is then allocated according
+   to that configuration.
+
+   The main role of the framework is not to allocate memory, but to
+   parse and manage memory configurations, as well as to act as an
+   intermediary between device drivers and pluggable allocators. It is
+   thus not tied to any memory allocation method or strategy.
+
+** Why is it needed?
+
+    Various devices on embedded systems have no scatter-gather and/or
+    IO map support and as such require contiguous blocks of memory to
+    operate.  They include devices such as cameras, hardware video
+    decoders and encoders, etc.
+
+    Such devices often require big memory buffers (a full HD frame is,
+    for instance, more than 2 megapixels large, i.e. more than 6 MB
+    of memory), which makes mechanisms such as kmalloc() ineffective.
+
+    Some embedded devices impose additional requirements on the
+    buffers, e.g. they can operate only on buffers allocated in
+    particular location/memory bank (if system has more than one
+    memory bank) or buffers aligned to a particular memory boundary.
+
+    Development of embedded devices has seen a big rise recently
+    (especially in the V4L area) and many such drivers include their
+    own memory allocation code. Most of them use bootmem-based methods.
+    CMA framework is an attempt to unify contiguous memory allocation
+    mechanisms and provide a simple API for device drivers, while
+    staying as customisable and modular as possible.
+
+** Design
+
+    The main design goal for the CMA was to provide a customisable and
+    modular framework, which could be configured to suit the needs of
+    individual systems.  Configuration specifies a list of memory
+    regions, which then are assigned to devices.  Memory regions can
+    be shared among many device drivers or assigned exclusively to
+    one.  This has been achieved in the following ways:
+
+    1. The core of the CMA does not handle allocation of memory and
+       management of free space.  Dedicated allocators are used for
+       that purpose.
+
+       This way, if the provided solution does not match demands
+       imposed on a given system, one can develop a new algorithm and
+       easily plug it into the CMA framework.
+
+       The presented solution includes an implementation of a best-fit
+       algorithm.
+
+    2. When requesting memory, devices have to introduce themselves.
+       This way CMA knows who the memory is allocated for.  This
+       allows for the system architect to specify which memory regions
+       each device should use.
+
+    3. Memory regions are grouped in various "types".  When a device
+       requests a chunk of memory, it can specify what type of memory
+       it needs.  If no type is specified, "common" is assumed.
+
+       This makes it possible to configure the system in such a way,
+       that a single device may get memory from different memory
+       regions, depending on the "type" of memory it requested.  For
+       example, a video codec driver might want to allocate some
+       shared buffers from the first memory bank and the other from
+       the second to get the highest possible memory throughput.
+
+    4. For greater flexibility and extensibility, the framework allows
+       device drivers to register private regions of reserved memory
+       which then may be used only by them.
+
+       As an effect, if a driver would not use the rest of the CMA
+       interface, it can still use CMA allocators and other
+       mechanisms.
+
+       4a. Early in boot process, device drivers can also request the
+           CMA framework to reserve a region of memory for them
+           which then will be used as a private region.
+
+           This way, drivers do not need to directly call bootmem,
+           memblock or similar early allocator but merely register an
+           early region and the framework will handle the rest
+           including choosing the right early allocator.
+
+** Use cases
+
+    Let's analyse some imaginary system that uses the CMA to see how
+    the framework can be used and configured.
+
+
+    We have a platform with a hardware video decoder and a camera each
+    needing 20 MiB of memory in the worst case.  Our system is written
+    in such a way though that the two devices are never used at the
+    same time and memory for them may be shared.  In such a system the
+    following configuration would be used in the platform
+    initialisation code:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video,camera=region";
+
+        cma_set_defaults(regions, map);
+
+    The regions array defines a single 20-MiB region named "region".
+    The map says that drivers named "video" and "camera" are to be
+    granted memory from the previously defined region.
+
+    A shorter map can be used as well:
+
+        static const char map[] __initconst = "*=region";
+
+    The asterisk ("*") matches all devices thus all devices will use
+    the region named "region".
+
+    We can see, that because the devices share the same memory region,
+    we save 20 MiB, compared to the situation when each of the devices
+    would reserve 20 MiB of memory for itself.
+
+
+    Now, let's say that we have also many other smaller devices and we
+    want them to share some smaller pool of memory.  For instance 5
+    MiB.  This can be achieved in the following way:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst =
+                "video,camera=region;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This instructs CMA to reserve two regions and let video and camera
+    use region "region" whereas all other devices should use region
+    "common".
+
+
+    Later on, after some development of the system, it can now run
+    video decoder and camera at the same time.  The 20 MiB region is
+    no longer enough for the two to share.  A quick fix can be made to
+    grant each of those devices separate regions:
+
+        static struct cma_region regions[] = {
+                { .name = "v", .size = 20 << 20 },
+                { .name = "c", .size = 20 << 20 },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video=v;camera=c;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This solution also shows how with CMA you can assign private pools
+    of memory to each device if that is required.
+
+
+    Allocation mechanisms can be replaced dynamically in a similar
+    manner as well. Let's say that during testing, it has been
+    discovered that, for a given shared region of 40 MiB,
+    fragmentation has become a problem.  It has been observed that,
+    after some time, it becomes impossible to allocate buffers of the
+    required sizes. So to satisfy our requirements, we would have to
+    reserve a larger shared region beforehand.
+
+    But fortunately, you have also managed to develop a new allocation
+    algorithm -- Neat Allocation Algorithm or "na" for short -- which
+    satisfies the needs for both devices even on a 30 MiB region.  The
+    configuration can be then quickly changed to:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 30 << 20, .alloc_name = "na" },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video,camera=region;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This shows how you can develop your own allocation algorithms if
+    the ones provided with CMA do not suit your needs and easily
+    replace them, without the need to modify CMA core or even
+    recompiling the kernel.
+
+** Technical Details
+
+*** The attributes
+
+    As shown above, CMA is configured by a two attributes: list
+    regions and map.  The first one specifies regions that are to be
+    reserved for CMA.  The second one specifies what regions each
+    device is assigned to.
+
+**** Regions
+
+     Regions is a list of regions terminated by a region with size
+     equal zero.  The following fields may be set:
+
+     - size       -- size of the region (required, must not be zero)
+     - alignment  -- alignment of the region; must be power of two or
+                     zero (optional)
+     - start      -- where the region has to start (optional)
+     - alloc_name -- the name of allocator to use (optional)
+     - alloc      -- allocator to use (optional; besides, alloc_name
+                     is probably what you want)
+
+     size, alignment and start are specified in bytes.  Size will be
+     aligned up to PAGE_SIZE.  If alignment is less than PAGE_SIZE it
+     will be set to PAGE_SIZE.  start will be aligned to alignment.
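+
+     For illustration only, a single 16 MiB region requested to start
+     at 256 MiB and aligned to 1 MiB (all of the numbers and the name
+     are made-up assumptions) could be described as:
+
+         static struct cma_region regions[] = {
+                 {
+                         .name      = "cam",
+                         .size      = 16 << 20,
+                         .alignment =  1 << 20,
+                         .start     = 256 << 20,
+                 },
+                 { }
+         };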
+
+**** Map
+
+     The format of the "map" attribute is as follows:
+
+         map-attr      ::= [ rules [ ';' ] ]
+         rules         ::= rule [ ';' rules ]
+         rule          ::= patterns '=' regions
+
+         patterns      ::= pattern [ ',' patterns ]
+
+         regions       ::= REG-NAME [ ',' regions ]
+                       // list of regions to try to allocate memory
+                       // from
+
+         pattern       ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+                       // pattern request must match for the rule to
+                       // apply; the first rule that matches is
+                       // applied; if dev-pattern part is omitted
+                       // value identical to the one used in previous
+                       // pattern is assumed.
+
+         dev-pattern   ::= PATTERN
+                       // pattern that device name must match for the
+                       // rule to apply; may contain question marks
+                       // which match any single character and may end
+                       // with an asterisk which matches the rest of
+                       // the string (including nothing).
+
+     It is a sequence of rules which specify what regions a given
+     (device, type) pair should use.  The first rule that matches is
+     applied.
+
+     For a rule to match, the pattern must match the (dev, type) pair.
+     A pattern consists of two parts separated by a slash.  The first
+     part must match the device name and the second part must match
+     the type.
+
+     If the first part is empty, the device name is assumed to match
+     iff it matched in the previous pattern.  If the second part is
+     omitted it will match any type of memory requested by the device.
+
+     Some examples (whitespace added for better readability):
+
+         cma_map = foo/quaz = r1;
+                       // device foo with type == "quaz" uses region r1
+
+                   foo/* = r2;     // OR:
+                   /* = r2;
+                       // device foo with any other kind uses region r2
+
+                   bar = r1,r2;
+                       // device bar uses region r1 or r2
+
+                   baz?/a , baz?/b = r3;
+                       // devices named baz? where ? is any character
+                       // with type being "a" or "b" use r3
+
+*** The device and types of memory
+
+    The name of the device is taken from the device structure.  It is
+    not possible to use CMA if driver does not register a device
+    (actually this can be overcome if a fake device structure is
+    provided with at least the name set).
+
+    The type of memory is an optional argument provided by the device
+    whenever it requests a memory chunk.  In many cases this can be
+    ignored but sometimes it may be required for some devices.
+
+    For instance, let's say that there are two memory banks and for
+    performance reasons a device uses buffers in both of them.
+    Platform defines memory types "a" and "b" for regions in both
+    banks.  The device driver would use those two types then to
+    request memory chunks from different banks.  CMA attributes could
+    look as follows:
+
+         static struct cma_region regions[] = {
+                 { .name = "a", .size = 32 << 20 },
+                 { .name = "b", .size = 32 << 20, .start = 512 << 20 },
+                 { }
+         };
+         static const char map[] __initconst = "foo/a=a;foo/b=b;*=a,b";
+
+    And whenever the driver allocates memory it specifies the type of
+    memory:
+
+        buffer1 = cma_alloc(dev, "a", 1 << 20, 0);
+        buffer2 = cma_alloc(dev, "b", 1 << 20, 0);
+
+    If it was needed to try to allocate from the other bank as well if
+    the dedicated one is full, the map attributes could be changed to:
+
+         static const char map[] __initconst = "foo/a=a,b;foo/b=b,a;*=a,b";
+
+    On the other hand, if the same driver was used on a system with
+    only one bank, the configuration could be changed just to:
+
+         static struct cma_region regions[] = {
+                 { .name = "r", .size = 64 << 20 },
+                 { }
+         };
+         static const char map[] __initconst = "*=r";
+
+    without the need to change the driver at all.
+
+*** Device API
+
+    There are three basic calls provided by the CMA framework to
+    devices.  To allocate a chunk of memory cma_alloc() function needs
+    to be used:
+
+        dma_addr_t cma_alloc(const struct device *dev, const char *type,
+                             size_t size, dma_addr_t alignment);
+
+    If required, the device may specify, in bytes, an alignment that
+    the chunk needs to satisfy.  It has to be a power of two or zero.
+    The chunks are always aligned at least to a page.
+
+    The type specifies the type of memory as described in the
+    previous subsection.  If the device driver does not care about
+    the memory type, it can safely pass NULL as the type, which is
+    the same as passing "common".
+
+    The basic usage of the function is just:
+
+        addr = cma_alloc(dev, NULL, size, 0);
+
+    The function returns the bus address of the allocated chunk or
+    a value that evaluates to true if checked with IS_ERR_VALUE(), so
+    the correct way of checking for errors is:
+
+        unsigned long addr = cma_alloc(dev, NULL, size, 0);
+        if (IS_ERR_VALUE(addr))
+                /* Error */
+                return (int)addr;
+        /* Allocated */
+
+    (Make sure to include <linux/err.h> which contains the definition
+    of the IS_ERR_VALUE() macro.)
+
+
+    An allocated chunk is freed via the cma_free() function:
+
+        int cma_free(dma_addr_t addr);
+
+    It takes the bus address of the chunk as an argument and frees it.
+
+
+    The last function is cma_info() which returns information about
+    regions assigned to a given (dev, type) pair.  Its syntax is:
+
+        int cma_info(struct cma_info *info,
+                     const struct device *dev,
+                     const char *type);
+
+    On successful exit it fills the info structure with the lower and
+    upper bound of the regions, the total and free size, and the
+    number of regions assigned to the given (dev, type) pair.
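+
+    A minimal sketch of using it (the device pointer, error handling
+    and the print format are assumptions made for illustration):
+
+        struct cma_info info;
+        int ret = cma_info(&info, dev, NULL);
+        if (ret)
+                return ret;
+        printk(KERN_INFO "%u region(s), %lu bytes free\n",
+               (unsigned)info.count, (unsigned long)info.free_size);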
+
+**** Dynamic and private regions
+
+     In the basic setup, regions are provided and initialised by
+     platform initialisation code (which usually uses
+     cma_set_defaults() for that purpose).
+
+     It is, however, possible to create and add regions dynamically
+     using the cma_region_register() function.
+
+         int cma_region_register(struct cma_region *reg);
+
+     The region does not have to have a name.  If it does not, it
+     won't be accessible via the standard mapping (the one provided
+     with the map attribute).  Such regions are private and to
+     allocate a chunk from them, one needs to call:
+
+         dma_addr_t cma_alloc_from_region(struct cma_region *reg,
+                                          size_t size, dma_addr_t alignment);
+
+     It is just like cma_alloc() except that one specifies what
+     region to allocate memory from.  The region must have been
+     registered.
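+
+     A sketch of how this could be used (the start address and the
+     sizes below are placeholders, not real reservations):
+
+         static struct cma_region my_reg = {
+                 /* memory at .start must already be reserved */
+                 .start = 0x40000000,
+                 .size  = 4 << 20,
+         };
+
+         dma_addr_t addr;
+
+         if (cma_region_register(&my_reg))
+                 /* handle error */;
+
+         addr = cma_alloc_from_region(&my_reg, 1 << 20, 0);
+         if (IS_ERR_VALUE(addr))
+                 /* handle error */;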
+
+**** Allocating from region specified by name
+
+     If a driver prefers allocating from a region or a list of
+     regions it knows the name of, it can use a different call,
+     similar to the previous one:
+
+         dma_addr_t cma_alloc_from(const char *regions,
+                                   size_t size, dma_addr_t alignment);
+
+     The first argument is a comma-separated list of regions the
+     driver desires CMA to try and allocate from.  The list is
+     terminated by a NUL byte or a semicolon.
+
+     Similarly, there is a call for requesting information about named
+     regions:
+
+        int cma_info_about(struct cma_info *info, const char *regions);
+
+     Generally, those interfaces should not be needed but they are
+     provided nevertheless.
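+
+     For illustration only (the region names are assumptions), such
+     calls might look like:
+
+         dma_addr_t addr = cma_alloc_from("a,b", 1 << 20, 0);
+
+         struct cma_info info;
+         int ret = cma_info_about(&info, "a,b");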
+
+**** Registering early regions
+
+     An early region is a region that is managed by CMA early during
+     the boot process.  It is the platform's responsibility to reserve
+     memory for early regions.  Later on, when CMA initialises, early
+     regions with reserved memory are registered as normal regions.
+     Registering an early region may be a way for a device to request
+     a private pool of memory without worrying about actually
+     reserving the memory:
+
+         int cma_early_region_register(struct cma_region *reg);
+
+     This needs to be done quite early in the boot process, before the
+     platform traverses the cma_early_regions list to reserve memory.
+
+     When the boot process ends, the device driver may check whether
+     the region was reserved (by checking the reg->reserved flag) and,
+     if so, whether it was successfully registered as a normal region
+     (by checking the reg->registered flag).  If that is the case, the
+     device driver can use normal API calls to use the region.
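+
+     A sketch of such use (the region name and size are made up; the
+     structure is reused after boot so it must not live in __initdata,
+     and the call must be made from early setup code, before the
+     platform reserves memory for early regions):
+
+         static struct cma_region foo_region = {
+                 .name = "foo",
+                 .size = 32 << 20,
+         };
+
+         int __init foo_early_setup(void)
+         {
+                 return cma_early_region_register(&foo_region);
+         }
+
+     Once boot has finished, the driver checks foo_region.reserved and
+     foo_region.registered and, if both are set, uses the usual calls
+     such as cma_alloc_from("foo", size, 0).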
+
+*** Allocator operations
+
+    Creating an allocator for CMA requires implementing four
+    functions.
+
+
+    The first two are used to initialise an allocator for a given
+    region and to clean up afterwards:
+
+        int  cma_foo_init(struct cma_region *reg);
+        void cma_foo_cleanup(struct cma_region *reg);
+
+    The first is called when the allocator is attached to a region.
+    When the function is called, the cma_region structure is fully
+    initialised (ie. starting address and size have correct values).
+    As a matter of fact, the allocator should never modify the
+    cma_region structure other than the private_data field, which it
+    may use to point to its private data.
+
+    The second call cleans up and frees all resources the allocator
+    has allocated for the region.  The function can assume that all
+    chunks allocated from this region have been freed and thus the
+    whole region is free.
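+
+    For instance, a hypothetical "foo" allocator that tracks a single
+    free range could keep its bookkeeping in a structure pointed to by
+    private_data (a sketch only):
+
+        struct cma_foo_private {
+                dma_addr_t free_start;
+                size_t     free_size;
+        };
+
+        int cma_foo_init(struct cma_region *reg)
+        {
+                struct cma_foo_private *prv =
+                        kzalloc(sizeof *prv, GFP_KERNEL);
+                if (!prv)
+                        return -ENOMEM;
+                prv->free_start = reg->start;
+                prv->free_size  = reg->size;
+                reg->private_data = prv;
+                return 0;
+        }
+
+        void cma_foo_cleanup(struct cma_region *reg)
+        {
+                kfree(reg->private_data);
+        }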
+
+
+    The two other calls are used for allocating and freeing chunks.
+    They are:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment);
+        void cma_foo_free(struct cma_chunk *chunk);
+
+    As the names imply, the first allocates a chunk of memory and the
+    other frees it.  The alloc callback must also create the cma_chunk
+    object representing the chunk in physical memory, and the free
+    callback must dispose of it.
+
+    Either of those functions can assume that it is the only thread
+    accessing the region.  Therefore, the allocator does not need to
+    worry about concurrency.  Moreover, all arguments are guaranteed
+    to be valid (i.e. a page-aligned size and a power-of-two alignment
+    no smaller than a page size).
+
+
+    When the allocator is ready, all that is left is to register it by
+    calling the cma_allocator_register() function:
+
+            int cma_allocator_register(struct cma_allocator *alloc);
+
+    The argument is a structure with pointers to the above functions
+    and the allocator's name.  The whole call may look something like
+    this:
+
+        static struct cma_allocator alloc = {
+                .name    = "foo",
+                .init    = cma_foo_init,
+                .cleanup = cma_foo_cleanup,
+                .alloc   = cma_foo_alloc,
+                .free    = cma_foo_free,
+        };
+        return cma_allocator_register(&alloc);
+
+    The name ("foo") will be used when this particular allocator is
+    requested as an allocator for a given region.
+
+*** Integration with platform
+
+    There is one function that needs to be called from platform
+    initialisation code.  That is the cma_early_regions_reserve()
+    function:
+
+        void cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+    It traverses the list of all early regions provided by the
+    platform and registered by drivers and reserves memory for them.
+    The only argument is a callback function used to reserve each
+    region.  Passing NULL as the argument is the same as passing the
+    cma_early_region_reserve() function, which uses bootmem and
+    memblock for the reservation.
+
+    Alternatively, platform code could traverse the cma_early_regions
+    list by itself but this should never be necessary.
+
+
+    The platform also has a way of providing default attributes for
+    CMA; the cma_set_defaults() function is used for that purpose:
+
+        int cma_set_defaults(struct cma_region *regions, const char *map)
+
+    It needs to be called prior to reserving regions.  It lets one
+    specify the list of regions defined by the platform and the map
+    attribute.  The map may point to a string in __initdata.  See
+    above in this document for example usage of this function.
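+
+    A typical sequence in board setup code may therefore look like
+    this (the region definition and the map string are examples only):
+
+        static struct cma_region regions[] = {
+                { .name = "common", .size = 16 << 20 },
+                { }
+        };
+
+        static void __init foo_board_reserve(void)
+        {
+                cma_set_defaults(regions, "*=common");
+                cma_early_regions_reserve(NULL);
+        }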
+
+** Future work
+
+    In the future, implementation of mechanisms that would allow the
+    free space inside the regions to be used as page cache, filesystem
+    buffers or swap devices is planned.  With such mechanisms, the
+    memory would not be wasted when not used.
+
+    Because all allocation and freeing of chunks passes through the
+    CMA framework, it can track which parts of the reserved memory are
+    free and which are allocated.  Tracking the unused memory would
+    let CMA use it for other purposes such as page cache, I/O
+    buffers, swap, etc.
diff --git a/include/linux/cma.h b/include/linux/cma.h
new file mode 100644
index 0000000..f6f9cb5
--- /dev/null
+++ b/include/linux/cma.h
@@ -0,0 +1,434 @@
+#ifndef __LINUX_CMA_H
+#define __LINUX_CMA_H
+
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+/***************************** Kernel level API *****************************/
+
+#ifdef __KERNEL__
+
+#include <linux/rbtree.h>
+#include <linux/list.h>
+#include <linux/init.h>
+
+
+struct device;
+struct cma_info;
+
+/*
+ * Don't call it directly, use cma_alloc(), cma_alloc_from() or
+ * cma_alloc_from_region().
+ */
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *type,
+	    size_t size, dma_addr_t alignment);
+
+/* Don't call it directly, use cma_info() or cma_info_about(). */
+int
+__cma_info(struct cma_info *info, const struct device *dev, const char *type);
+
+
+/**
+ * cma_alloc - allocates contiguous chunk of memory.
+ * @dev:	The device to perform allocation for.
+ * @type:	A type of memory to allocate.  Platform may define
+ *		several different types of memory and device drivers
+ *		can then request chunks of different types.  Usually it's
+ *		safe to pass NULL here which is the same as passing
+ *		"common".
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size.  If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc(const struct device *dev, const char *type,
+	  size_t size, dma_addr_t alignment)
+{
+	return dev ? __cma_alloc(dev, type, size, alignment) : -EINVAL;
+}
+
+
+/**
+ * struct cma_info - information about regions returned by cma_info().
+ * @lower_bound:	The smallest address that is possible to be
+ *			allocated for given (dev, type) pair.
+ * @upper_bound:	The one byte after the biggest address that is
+ *			possible to be allocated for given (dev, type)
+ *			pair.
+ * @total_size:	Total size of regions mapped to (dev, type) pair.
+ * @free_size:	Total free size in all of the regions mapped to (dev, type)
+ *		pair.  Because of possible race conditions, it is not
+ *		guaranteed that the value will be correct -- it gives only
+ *		an approximation.
+ * @count:	Number of regions mapped to (dev, type) pair.
+ */
+struct cma_info {
+	dma_addr_t lower_bound, upper_bound;
+	size_t total_size, free_size;
+	unsigned count;
+};
+
+/**
+ * cma_info - queries information about regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @dev:	The device to query information for.
+ * @type:	A type of memory to query information for.
+ *		If unsure, pass NULL here which is equal to passing
+ *		"common".
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info(struct cma_info *info, const struct device *dev, const char *type)
+{
+	return dev ? __cma_info(info, dev, type) : -EINVAL;
+}
+
+
+/**
+ * cma_free - frees a chunk of memory.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns -ENOENT if there is no chunk at given location; otherwise
+ * zero.  In the former case it issues a warning.
+ */
+int cma_free(dma_addr_t addr);
+
+
+
+/****************************** Lower lever API *****************************/
+
+/**
+ * cma_alloc_from - allocates contiguous chunk of memory from named regions.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ *		byte or a semicolon.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size.  If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)
+{
+	return __cma_alloc(NULL, regions, size, alignment);
+}
+
+/**
+ * cma_info_about - queries information about named regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ *		byte or a semicolon.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+{
+	return __cma_info(info, NULL, regions);
+}
+
+
+
+struct cma_allocator;
+
+/**
+ * struct cma_region - a region reserved for CMA allocations.
+ * @name:	Unique name of the region.  Read only.
+ * @start:	Bus address of the region in bytes.  Always aligned at
+ *		least to a full page.  Read only.
+ * @size:	Size of the region in bytes.  Multiple of a page size.
+ *		Read only.
+ * @free_space:	Free space in the region.  Read only.
+ * @alignment:	Desired alignment of the region in bytes.  A power of two,
+ *		always at least page size.  Early.
+ * @alloc:	Allocator used with this region.  NULL means allocator is
+ *		not attached.  Private.
+ * @alloc_name:	Allocator name read from cmdline.  Private.  This may be
+ *		different from @alloc->name.
+ * @private_data:	Allocator's private data.
+ * @users:	Number of chunks allocated in this region.
+ * @list:	Entry in list of regions.  Private.
+ * @used:	Whether region was already used, ie. there was at least
+ *		one allocation request for it.  Private.
+ * @registered:	Whether this region has been registered.  Read only.
+ * @reserved:	Whether this region has been reserved.  Early.  Read only.
+ * @copy_name:	Whether @name and @alloc_name needs to be copied when
+ *		this region is converted from early to normal.  Early.
+ *		Private.
+ * @free_alloc_name:	Whether @alloc_name was kmalloced().  Private.
+ *
+ * Regions come in two types: an early region and normal region.  The
+ * former can be reserved or not-reserved.  Fields marked as "early"
+ * are only meaningful in early regions.
+ *
+ * Early regions are important only during initialisation.  The list
+ * of early regions is built from the "cma" command line argument or
+ * platform defaults.  Platform initialisation code is responsible for
+ * reserving space for unreserved regions that are placed on
+ * cma_early_regions list.
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+struct cma_region {
+	const char *name;
+	dma_addr_t start;
+	size_t size;
+	union {
+		size_t free_space;	/* Normal region */
+		dma_addr_t alignment;	/* Early region */
+	};
+
+	struct cma_allocator *alloc;
+	const char *alloc_name;
+	void *private_data;
+
+	unsigned users;
+	struct list_head list;
+
+	unsigned used:1;
+	unsigned registered:1;
+	unsigned reserved:1;
+	unsigned copy_name:1;
+	unsigned free_alloc_name:1;
+};
+
+
+/**
+ * cma_region_register() - registers a region.
+ * @reg:	Region to register.
+ *
+ * Region's start and size must be set.
+ *
+ * If name is set, the region will be accessible using normal
+ * mechanisms like the mapping or the cma_alloc_from() function;
+ * otherwise it will be a private region, accessible only using the
+ * cma_alloc_from_region() function.
+ *
+ * If alloc is set, the function will try to initialise the given
+ * allocator (and will return an error if it fails).  Otherwise
+ * alloc_name may
+ * point to a name of an allocator to use (if not set, the default
+ * will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  In particular, -EADDRINUSE if the
+ * region overlaps with an already existing region.
+ */
+int __must_check cma_region_register(struct cma_region *reg);
+
+/**
+ * cma_region_unregister() - unregisters a region.
+ * @reg:	Region to unregister.
+ *
+ * Region is unregistered only if there are no chunks allocated for
+ * it.  Otherwise, function returns -EBUSY.
+ *
+ * On success returns zero.
+ */
+int __must_check cma_region_unregister(struct cma_region *reg);
+
+
+/**
+ * cma_alloc_from_region() - allocates contiguous chunk of memory from region.
+ * @reg:	Region to allocate chunk from.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size.  If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);
+
+
+
+/****************************** Allocators API ******************************/
+
+/**
+ * struct cma_chunk - an allocated contiguous chunk of memory.
+ * @start:	Bus address in bytes.
+ * @size:	Size in bytes.
+ * @reg:	Region this chunk belongs to.
+ * @by_start:	A node in an red-black tree with all chunks sorted by
+ *		start address.
+ *
+ * The cma_allocator::alloc() operation need to set only the @start
+ * and @size fields.  The rest is handled by the caller (ie. CMA
+ * glue).
+ */
+struct cma_chunk {
+	dma_addr_t start;
+	size_t size;
+
+	struct cma_region *reg;
+	struct rb_node by_start;
+};
+
+
+/**
+ * struct cma_allocator - a CMA allocator.
+ * @name:	Allocator's unique name
+ * @init:	Initialises an allocator on given region.
+ * @cleanup:	Cleans up after init.  May assume that there are no chunks
+ *		allocated in given region.
+ * @alloc:	Allocates a chunk of memory of given size in bytes and
+ *		with given alignment.  Alignment is a power of
+ *		two (thus non-zero) and callback does not need to check it.
+ *		May also assume that it is the only call that uses given
+ *		region (ie. access to the region is synchronised with
+ *		a mutex).  This has to allocate the chunk object (it may be
+ *		contained in a bigger structure with allocator-specific data).
+ *		Required.
+ * @free:	Frees allocated chunk.  May also assume that it is the only
+ *		call that uses given region.  This has to free() the chunk
+ *		object as well.  Required.
+ * @list:	Entry in list of allocators.  Private.
+ */
+struct cma_allocator {
+	const char *name;
+
+	int (*init)(struct cma_region *reg);
+	void (*cleanup)(struct cma_region *reg);
+	struct cma_chunk *(*alloc)(struct cma_region *reg, size_t size,
+				   dma_addr_t alignment);
+	void (*free)(struct cma_chunk *chunk);
+
+	struct list_head list;
+};
+
+
+/**
+ * cma_allocator_register() - Registers an allocator.
+ * @alloc:	Allocator to register.
+ *
+ * Adds allocator to the list of allocators managed by CMA.
+ *
+ * All of the fields of cma_allocator structure must be set except for
+ * the optional name and the list's head which will be overridden
+ * anyway.
+ *
+ * Returns zero or negative error code.
+ */
+int cma_allocator_register(struct cma_allocator *alloc);
+
+
+/**************************** Initialisation API ****************************/
+
+/**
+ * cma_set_defaults() - specifies default command line parameters.
+ * @regions:	A zero-sized entry terminated list of early regions.
+ *		This array must not be placed in __initdata section.
+ * @map:	Map attribute.
+ *
+ * This function should be called prior to cma_early_regions_reserve()
+ * and after early parameters have been parsed.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_set_defaults(struct cma_region *regions, const char *map);
+
+
+/**
+ * cma_early_regions - a list of early regions.
+ *
+ * The platform needs to reserve space for each of the regions before
+ * initcalls are executed.  If space is reserved, the reserved flag
+ * must be set.  Platform initialisation code may choose to use
+ * cma_early_regions_reserve().
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+extern struct list_head cma_early_regions __initdata;
+
+
+/**
+ * cma_early_region_register() - registers an early region.
+ * @reg:	Region to add.
+ *
+ * The region's size, start and alignment must be set (however the
+ * last two can be zero).  If name is set, the region will be
+ * accessible using normal mechanisms like the mapping or the
+ * cma_alloc_from() function; otherwise it will be a private region
+ * accessible only using cma_alloc_from_region().
+ *
+ * During platform initialisation, space is reserved for early
+ * regions.  Later, when CMA initialises, the early regions are
+ * "converted" into normal regions.  If cma_region::alloc is set, CMA
+ * will then try to setup given allocator on the region.  Failure to
+ * do so will result in the region not being registered even though
+ * the space for it will still be reserved.  If cma_region::alloc is
+ * not set, allocator will be attached to the region on first use and
+ * the value of cma_region::alloc_name will be taken into account if
+ * set.
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  No checking if regions overlap is
+ * performed.
+ */
+int __init __must_check cma_early_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_early_region_reserve() - reserves a physically contiguous memory region.
+ * @reg:	Early region to reserve memory for.
+ *
+ * If the platform supports bootmem, this is the first allocator this
+ * function tries to use.  If that fails (or bootmem is not
+ * supported), the function tries to use memblock if it is available.
+ *
+ * On success sets reg->reserved flag.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_early_region_reserve(struct cma_region *reg);
+
+/**
+ * cma_early_regions_reserve() - helper function for reserving early regions.
+ * @reserve:	Callback function used to reserve space for a region.
+ *		Needs to return non-negative if the reservation
+ *		succeeded, negative error otherwise.  NULL means
+ *		cma_early_region_reserve() will be used.
+ *
+ * This function traverses the %cma_early_regions list and tries to
+ * reserve memory for each early region.  It uses the @reserve
+ * callback function for that purpose.  The reserved flag of each
+ * region is updated accordingly.
+ */
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+#else
+
+#define cma_set_defaults(regions, map)     ((int)0)
+#define cma_early_region_reserve(region)   ((int)-EOPNOTSUPP)
+#define cma_early_regions_reserve(reserve) do { } while (0)
+
+#endif
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index f4e516e..86043a3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -301,3 +301,44 @@ config NOMMU_INITIAL_TRIM_EXCESS
 	  of 1 says that all excess pages should be trimmed.
 
 	  See Documentation/nommu-mmap.txt for more information.
+
+
+config CMA
+	bool "Contiguous Memory Allocator framework"
+	# Currently there is only one allocator so force it on
+	select CMA_BEST_FIT
+	help
+	  This enables the Contiguous Memory Allocator framework which
+	  allows drivers to allocate big physically-contiguous blocks of
+	  memory for use with hardware components that support neither
+	  I/O mapping nor scatter-gather.
+
+	  If you select this option you will also have to select at least
+	  one allocator algorithm below.
+
+	  To make use of CMA you need to specify the regions and
+	  driver->region mapping on command line when booting the kernel.
+
+	  For more information see <Documentation/contiguous-memory.txt>.
+	  If unsure, say "n".
+
+config CMA_DEBUG
+	bool "CMA debug messages (DEVELOPMENT)"
+	depends on CMA
+	help
+	  Turns on debug messages in CMA.  This produces KERN_DEBUG
+	  messages for every CMA call as well as various messages while
+	  processing calls such as cma_alloc().  This option does not
+	  affect warning and error messages.
+
+	  This is mostly used during development.  If unsure, say "n".
+
+config CMA_BEST_FIT
+	bool "CMA best-fit allocator"
+	depends on CMA
+	help
+	  This is a best-fit algorithm running in O(n log n) time where
+	  n is the number of existing holes (which is never greater than
+	  the number of allocated regions and usually much smaller).  It
+	  allocates the area from the smallest hole that is big enough
+	  for the allocation in question.
diff --git a/mm/Makefile b/mm/Makefile
index 34b2546..d8c717f 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -47,3 +47,5 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
diff --git a/mm/cma-best-fit.c b/mm/cma-best-fit.c
new file mode 100644
index 0000000..97f8d61
--- /dev/null
+++ b/mm/cma-best-fit.c
@@ -0,0 +1,407 @@
+/*
+ * Contiguous Memory Allocator framework: Best Fit allocator
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "cma: bf: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/slab.h>        /* kmalloc() */
+
+#include <linux/cma.h>         /* CMA structures */
+
+
+/************************* Data Types *************************/
+
+struct cma_bf_item {
+	struct cma_chunk ch;
+	struct rb_node by_size;
+};
+
+struct cma_bf_private {
+	struct rb_root by_start_root;
+	struct rb_root by_size_root;
+};
+
+
+/************************* Prototypes *************************/
+
+/*
+ * These are only for holes.  They must be called whenever a hole's
+ * properties change, but also whenever a chunk becomes a hole or
+ * a hole becomes a chunk.
+ */
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item);
+static int  __must_check
+__cma_bf_hole_insert_by_start(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item);
+
+/**
+ * __cma_bf_hole_take - takes a chunk of memory out of a hole.
+ * @hole:	hole to take chunk from
+ * @size:	chunk's size
+ * @alignment:	chunk's starting address alignment (must be power of two)
+ *
+ * Takes a @size bytes large chunk from hole @hole which must be able
+ * to hold the chunk.  The "must be able" includes also alignment
+ * constraint.
+ *
+ * Returns allocated item or NULL on error (if kmalloc() failed).
+ */
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment);
+
+/**
+ * __cma_bf_hole_merge_maybe - tries to merge hole with neighbours.
+ * @item: hole to try and merge
+ *
+ * Which items are preserved is undefined so you may not rely on it.
+ */
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item);
+
+
+/************************* Device API *************************/
+
+int cma_bf_init(struct cma_region *reg)
+{
+	struct cma_bf_private *prv;
+	struct cma_bf_item *item;
+
+	prv = kzalloc(sizeof *prv, GFP_KERNEL);
+	if (unlikely(!prv))
+		return -ENOMEM;
+
+	item = kzalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item)) {
+		kfree(prv);
+		return -ENOMEM;
+	}
+
+	item->ch.start = reg->start;
+	item->ch.size  = reg->size;
+	item->ch.reg   = reg;
+
+	rb_root_init(&prv->by_start_root, &item->ch.by_start);
+	rb_root_init(&prv->by_size_root, &item->by_size);
+
+	reg->private_data = prv;
+	return 0;
+}
+
+void cma_bf_cleanup(struct cma_region *reg)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct cma_bf_item *item =
+		rb_entry(prv->by_size_root.rb_node,
+			 struct cma_bf_item, by_size);
+
+	/* We can assume there is only a single hole in the tree. */
+	WARN_ON(item->by_size.rb_left || item->by_size.rb_right ||
+		item->ch.by_start.rb_left || item->ch.by_start.rb_right);
+
+	kfree(item);
+	kfree(prv);
+}
+
+struct cma_chunk *cma_bf_alloc(struct cma_region *reg,
+			       size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct rb_node *node = prv->by_size_root.rb_node;
+	struct cma_bf_item *item = NULL;
+
+	/* First find hole that is large enough */
+	while (node) {
+		struct cma_bf_item *i =
+			rb_entry(node, struct cma_bf_item, by_size);
+
+		if (i->ch.size < size) {
+			node = node->rb_right;
+		} else if (i->ch.size >= size) {
+			node = node->rb_left;
+			item = i;
+		}
+	}
+	if (!item)
+		return NULL;
+
+	/* Now look for items which can satisfy alignment requirements */
+	for (;;) {
+		dma_addr_t start = ALIGN(item->ch.start, alignment);
+		dma_addr_t end   = item->ch.start + item->ch.size;
+		if (start < end && end - start >= size) {
+			item = __cma_bf_hole_take(item, size, alignment);
+			return likely(item) ? &item->ch : NULL;
+		}
+
+		node = rb_next(&item->by_size);
+		if (!node)
+			return NULL;
+
+		item  = rb_entry(node, struct cma_bf_item, by_size);
+	}
+}
+
+void cma_bf_free(struct cma_chunk *chunk)
+{
+	struct cma_bf_item *item = container_of(chunk, struct cma_bf_item, ch);
+
+	/* Add new hole */
+	if (unlikely(__cma_bf_hole_insert_by_start(item))) {
+		/*
+		 * We're screwed...  Just free the item and forget
+		 * about it.  Things are broken beyond repair so no
+		 * sense in trying to recover.
+		 */
+		kfree(item);
+	} else {
+		__cma_bf_hole_insert_by_size(item);
+
+		/* Merge with prev and next sibling */
+		__cma_bf_hole_merge_maybe(item);
+	}
+}
+
+
+/************************* Basic Tree Manipulation *************************/
+
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_size_root.rb_node, *parent = NULL;
+	const typeof(item->ch.size) value = item->ch.size;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, by_size);
+		link = value <= i->ch.size
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->by_size, parent, link);
+	rb_insert_color(&item->by_size, &prv->by_size_root);
+}
+
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->by_size, &prv->by_size_root);
+}
+
+static int  __must_check
+__cma_bf_hole_insert_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_start_root.rb_node, *parent = NULL;
+	const typeof(item->ch.start) value = item->ch.start;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, ch.by_start);
+
+		if (WARN_ON(value == i->ch.start))
+			/*
+			 * This should *never* happen.  And I mean
+			 * *never*.  We could even BUG on it but
+			 * hopefully things are only a bit broken,
+			 * ie. system can still run.  We produce
+			 * a warning and return an error.
+			 */
+			return -EBUSY;
+
+		link = value <= i->ch.start
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->ch.by_start, parent, link);
+	rb_insert_color(&item->ch.by_start, &prv->by_start_root);
+	return 0;
+}
+
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->ch.by_start, &prv->by_start_root);
+}
+
+
+/************************* More Tree Manipulation *************************/
+
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_item *item;
+
+	/*
+	 * There are three cases:
+	 * 1. the chunk takes the whole hole,
+	 * 2. the chunk is at the beginning or at the end of the hole, or
+	 * 3. the chunk is in the middle of the hole.
+	 */
+
+
+	/* Case 1, the whole hole */
+	if (size == hole->ch.size) {
+		__cma_bf_hole_erase_by_size(hole);
+		__cma_bf_hole_erase_by_start(hole);
+		return hole;
+	}
+
+
+	/* Allocate */
+	item = kmalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item))
+		return NULL;
+
+	item->ch.start = ALIGN(hole->ch.start, alignment);
+	item->ch.size  = size;
+	item->ch.reg   = hole->ch.reg;
+
+	/* Case 3, in the middle */
+	if (item->ch.start != hole->ch.start
+	 && item->ch.start + item->ch.size !=
+	    hole->ch.start + hole->ch.size) {
+		struct cma_bf_item *tail;
+
+		/*
+		 * Space between the end of the chunk and the end of
+		 * the region, ie. space left after the end of the
+		 * chunk.  If this is dividable by alignment we can
+		 * move the chunk to the end of the hole.
+		 */
+		size_t left =
+			hole->ch.start + hole->ch.size -
+			(item->ch.start + item->ch.size);
+		if (left % alignment == 0) {
+			item->ch.start += left;
+			goto case_2;
+		}
+
+		/*
+		 * We are going to add a hole at the end.  This way,
+		 * we will reduce the problem to case 2 -- the chunk
+		 * will be at the end of the hole.
+		 */
+		tail = kmalloc(sizeof *tail, GFP_KERNEL);
+		if (unlikely(!tail)) {
+			kfree(item);
+			return NULL;
+		}
+
+		tail->ch.start = item->ch.start + item->ch.size;
+		tail->ch.size  =
+			hole->ch.start + hole->ch.size - tail->ch.start;
+		tail->ch.reg   = hole->ch.reg;
+
+		if (unlikely(__cma_bf_hole_insert_by_start(tail))) {
+			/*
+			 * Things are broken beyond repair...  Abort
+			 * inserting the hole but still continue with
+			 * allocation (seems like the best we can do).
+			 */
+
+			hole->ch.size = tail->ch.start - hole->ch.start;
+			kfree(tail);
+		} else {
+			__cma_bf_hole_insert_by_size(tail);
+			/*
+			 * It's important that we first insert the new
+			 * hole in the tree sorted by size and later
+			 * reduce the size of the old hole.  We will
+			 * update the position of the old hole in the
+			 * rb tree in code that handles case 2.
+			 */
+			hole->ch.size = tail->ch.start - hole->ch.start;
+		}
+
+		/* Go to case 2 */
+	}
+
+
+	/* Case 2, at the beginning or at the end */
+case_2:
+	/* No need to update the tree; order preserved. */
+	if (item->ch.start == hole->ch.start)
+		hole->ch.start += item->ch.size;
+
+	/* Alter hole's size */
+	hole->ch.size -= size;
+	__cma_bf_hole_erase_by_size(hole);
+	__cma_bf_hole_insert_by_size(hole);
+
+	return item;
+}
+
+
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item)
+{
+	struct cma_bf_item *prev;
+	struct rb_node *node;
+	int twice = 2;
+
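+	/*
+	 * The loop below runs twice: first it tries to merge the hole
+	 * with its previous neighbour, then with the next one.
+	 */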
+	node = rb_prev(&item->ch.by_start);
+	if (unlikely(!node))
+		goto next;
+	prev = rb_entry(node, struct cma_bf_item, ch.by_start);
+
+	for (;;) {
+		if (prev->ch.start + prev->ch.size == item->ch.start) {
+			/* Remove previous hole from trees */
+			__cma_bf_hole_erase_by_size(prev);
+			__cma_bf_hole_erase_by_start(prev);
+
+			/* Alter this hole */
+			item->ch.size += prev->ch.size;
+			item->ch.start = prev->ch.start;
+			__cma_bf_hole_erase_by_size(item);
+			__cma_bf_hole_insert_by_size(item);
+			/*
+			 * No need to update by start trees as we do
+			 * not break sequence order
+			 */
+
+			/* Free prev hole */
+			kfree(prev);
+		}
+
+next:
+		if (!--twice)
+			break;
+
+		node = rb_next(&item->ch.by_start);
+		if (unlikely(!node))
+			break;
+		prev = item;
+		item = rb_entry(node, struct cma_bf_item, ch.by_start);
+	}
+}
+
+
+
+/************************* Register *************************/
+static int cma_bf_module_init(void)
+{
+	static struct cma_allocator alloc = {
+		.name    = "bf",
+		.init    = cma_bf_init,
+		.cleanup = cma_bf_cleanup,
+		.alloc   = cma_bf_alloc,
+		.free    = cma_bf_free,
+	};
+	return cma_allocator_register(&alloc);
+}
+module_init(cma_bf_module_init);
diff --git a/mm/cma.c b/mm/cma.c
new file mode 100644
index 0000000..06d0d5a
--- /dev/null
+++ b/mm/cma.c
@@ -0,0 +1,911 @@
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#ifndef CONFIG_NO_BOOTMEM
+#  include <linux/bootmem.h>   /* alloc_bootmem_pages_nopanic() */
+#endif
+#ifdef CONFIG_HAVE_MEMBLOCK
+#  include <linux/memblock.h>  /* memblock*() */
+#endif
+#include <linux/device.h>      /* struct device, dev_name() */
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR, PTR_ERR, etc. */
+#include <linux/mm.h>          /* PAGE_ALIGN() */
+#include <linux/module.h>      /* EXPORT_SYMBOL_GPL() */
+#include <linux/mutex.h>       /* mutex */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/string.h>      /* str*() */
+
+#include <linux/cma.h>
+
+
+/*
+ * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
+ * cma_chunks_by_start.
+ */
+static DEFINE_MUTEX(cma_mutex);
+
+
+
+/************************* Map attribute *************************/
+
+static const char *cma_map;
+static size_t cma_map_length;
+
+/*
+ * map-attr      ::= [ rules [ ';' ] ]
+ * rules         ::= rule [ ';' rules ]
+ * rule          ::= patterns '=' regions
+ * patterns      ::= pattern [ ',' patterns ]
+ * regions       ::= REG-NAME [ ',' regions ]
+ * pattern       ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ */
+static ssize_t cma_map_validate(const char *param)
+{
+	const char *ch = param;
+
+	if (*ch == '\0' || *ch == '\n')
+		return 0;
+
+	for (;;) {
+		const char *start = ch;
+
+		while (*ch && *ch != '\n' && *ch != ';' && *ch != '=')
+			++ch;
+
+		if (*ch != '=' || start == ch) {
+			pr_err("map: expecting \"<patterns>=<regions>\" near %s\n",
+			       start);
+			return -EINVAL;
+		}
+
+		while (*++ch != ';')
+			if (*ch == '\0' || *ch == '\n')
+				return ch - param;
+		if (ch[1] == '\0' || ch[1] == '\n')
+			return ch - param;
+		++ch;
+	}
+}
+
+static int __init cma_map_param(char *param)
+{
+	ssize_t len;
+
+	pr_debug("param: map: %s\n", param);
+
+	len = cma_map_validate(param);
+	if (len < 0)
+		return len;
+
+	cma_map = param;
+	cma_map_length = len;
+	return 0;
+}
+
+
+
+/************************* Early regions *************************/
+
+struct list_head cma_early_regions __initdata =
+	LIST_HEAD_INIT(cma_early_regions);
+
+
+int __init __must_check cma_early_region_register(struct cma_region *reg)
+{
+	dma_addr_t start, alignment;
+	size_t size;
+
+	if (reg->alignment & (reg->alignment - 1))
+		return -EINVAL;
+
+	alignment = max(reg->alignment, (dma_addr_t)PAGE_SIZE);
+	start     = ALIGN(reg->start, alignment);
+	size      = PAGE_ALIGN(reg->size);
+
+	if (start + size < start)
+		return -EINVAL;
+
+	reg->size      = size;
+	reg->start     = start;
+	reg->alignment = alignment;
+
+	list_add_tail(&reg->list, &cma_early_regions);
+
+	pr_debug("param: registering early region %s (%p@%p/%p)\n",
+		 reg->name, (void *)reg->size, (void *)reg->start,
+		 (void *)reg->alignment);
+
+	return 0;
+}
+
+
+
+/************************* Regions & Allocators *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg);
+
+/* List of all regions.  Named regions are kept before unnamed. */
+static LIST_HEAD(cma_regions);
+
+#define cma_foreach_region(reg) \
+	list_for_each_entry(reg, &cma_regions, list)
+
+int __must_check cma_region_register(struct cma_region *reg)
+{
+	const char *name, *alloc_name;
+	struct cma_region *r;
+	char *ch = NULL;
+	int ret = 0;
+
+	if (!reg->size || reg->start + reg->size < reg->start)
+		return -EINVAL;
+
+	reg->users = 0;
+	reg->used = 0;
+	reg->private_data = NULL;
+	reg->registered = 0;
+	reg->free_space = reg->size;
+
+	/* Copy name and alloc_name */
+	name = reg->name;
+	alloc_name = reg->alloc_name;
+	if (reg->copy_name && (reg->name || reg->alloc_name)) {
+		size_t name_size, alloc_size;
+
+		name_size  = reg->name       ? strlen(reg->name) + 1       : 0;
+		alloc_size = reg->alloc_name ? strlen(reg->alloc_name) + 1 : 0;
+
+		ch = kmalloc(name_size + alloc_size, GFP_KERNEL);
+		if (!ch) {
+			pr_err("%s: not enough memory to allocate name\n",
+			       reg->name ?: "(private)");
+			return -ENOMEM;
+		}
+
+		if (name_size) {
+			memcpy(ch, reg->name, name_size);
+			name = ch;
+			ch += name_size;
+		}
+
+		if (alloc_size) {
+			memcpy(ch, reg->alloc_name, alloc_size);
+			alloc_name = ch;
+		}
+	}
+
+	mutex_lock(&cma_mutex);
+
+	/* Don't let regions overlap */
+	cma_foreach_region(r)
+		if (r->start + r->size > reg->start &&
+		    r->start < reg->start + reg->size) {
+			ret = -EADDRINUSE;
+			goto done;
+		}
+
+	if (reg->alloc) {
+		ret = __cma_region_attach_alloc(reg);
+		if (unlikely(ret < 0))
+			goto done;
+	}
+
+	reg->name = name;
+	reg->alloc_name = alloc_name;
+	reg->registered = 1;
+	ch = NULL;
+
+	/*
+	 * Keep named at the beginning and unnamed (private) at the
+	 * end.  This helps in traversal when named region is looked
+	 * for.
+	 */
+	if (name)
+		list_add(&reg->list, &cma_regions);
+	else
+		list_add_tail(&reg->list, &cma_regions);
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: region %sregistered\n",
+		 reg->name ?: "(private)", ret ? "not " : "");
+	kfree(ch);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_region_register);
+
+static struct cma_region *__must_check
+__cma_region_find(const char **namep)
+{
+	struct cma_region *reg;
+	const char *ch, *name;
+	size_t n;
+
+	ch = *namep;
+	while (*ch && *ch != ',' && *ch != ';')
+		++ch;
+	name = *namep;
+	*namep = *ch == ',' ? ch + 1 : ch;
+	n = ch - name;
+
+	/*
+	 * Named regions are kept in front of unnamed so if we
+	 * encounter unnamed region we can stop.
+	 */
+	cma_foreach_region(reg)
+		if (!reg->name)
+			break;
+		else if (!strncmp(name, reg->name, n) && !reg->name[n])
+			return reg;
+
+	return NULL;
+}
+
+
+/* List of all allocators. */
+static LIST_HEAD(cma_allocators);
+
+#define cma_foreach_allocator(alloc) \
+	list_for_each_entry(alloc, &cma_allocators, list)
+
+int cma_allocator_register(struct cma_allocator *alloc)
+{
+	struct cma_region *reg;
+	int first;
+
+	if (!alloc->alloc || !alloc->free)
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	first = list_empty(&cma_allocators);
+
+	list_add_tail(&alloc->list, &cma_allocators);
+
+	/*
+	 * Attach this allocator to all allocator-less regions that
+	 * request this particular allocator (reg->alloc_name equals
+	 * alloc->name) or if region wants the first available
+	 * allocator and we are the first.
+	 */
+	cma_foreach_region(reg) {
+		if (reg->alloc)
+			continue;
+		if (reg->alloc_name
+		  ? !alloc->name || strcmp(alloc->name, reg->alloc_name)
+		  : (reg->used || !first))
+			continue;
+
+		reg->alloc = alloc;
+		__cma_region_attach_alloc(reg);
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: allocator registered\n", alloc->name ?: "(unnamed)");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cma_allocator_register);
+
+static struct cma_allocator *__must_check
+__cma_allocator_find(const char *name)
+{
+	struct cma_allocator *alloc;
+
+	if (!name)
+		return list_empty(&cma_allocators)
+			? NULL
+			: list_entry(cma_allocators.next,
+				     struct cma_allocator, list);
+
+	cma_foreach_allocator(alloc)
+		if (alloc->name && !strcmp(name, alloc->name))
+			return alloc;
+
+	return NULL;
+}
+
+
+
+/************************* Initialise CMA *************************/
+
+int __init cma_set_defaults(struct cma_region *regions, const char *map)
+{
+	if (map) {
+		int ret = cma_map_param((char *)map);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	if (!regions)
+		return 0;
+
+	for (; regions->size; ++regions) {
+		int ret = cma_early_region_register(regions);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	return 0;
+}
+
+
+int __init cma_early_region_reserve(struct cma_region *reg)
+{
+	int tried = 0;
+
+	if (!reg->size || (reg->alignment & (reg->alignment - 1)) ||
+	    reg->reserved)
+		return -EINVAL;
+
+#ifndef CONFIG_NO_BOOTMEM
+
+	tried = 1;
+
+	{
+		void *ptr = __alloc_bootmem_nopanic(reg->size, reg->alignment,
+						    reg->start);
+		if (ptr) {
+			reg->start = virt_to_phys(ptr);
+			reg->reserved = 1;
+			return 0;
+		}
+	}
+
+#endif
+
+#ifdef CONFIG_HAVE_MEMBLOCK
+
+	tried = 1;
+
+	if (reg->start) {
+		if (!memblock_is_region_reserved(reg->start, reg->size) &&
+		    memblock_reserve(reg->start, reg->size) >= 0) {
+			reg->reserved = 1;
+			return 0;
+		}
+	} else {
+		/*
+		 * Use __memblock_alloc_base() since
+		 * memblock_alloc_base() panic()s.
+		 */
+		u64 ret = __memblock_alloc_base(reg->size, reg->alignment, 0);
+		if (ret &&
+		    ret < ~(dma_addr_t)0 &&
+		    ret + reg->size < ~(dma_addr_t)0 &&
+		    ret + reg->size > ret) {
+			reg->start = ret;
+			reg->reserved = 1;
+			return 0;
+		}
+
+		if (ret)
+			memblock_free(ret, reg->size);
+	}
+
+#endif
+
+	return tried ? -ENOMEM : -EOPNOTSUPP;
+}
+
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg))
+{
+	struct cma_region *reg;
+
+	pr_debug("init: reserving early regions\n");
+
+	if (!reserve)
+		reserve = cma_early_region_reserve;
+
+	list_for_each_entry(reg, &cma_early_regions, list) {
+		if (reg->reserved) {
+			/* nothing */
+		} else if (reserve(reg) >= 0) {
+			pr_debug("init: %s: reserved %p@%p\n",
+				 reg->name ?: "(private)",
+				 (void *)reg->size, (void *)reg->start);
+			reg->reserved = 1;
+		} else {
+			pr_warn("init: %s: unable to reserve %p@%p/%p\n",
+				reg->name ?: "(private)",
+				(void *)reg->size, (void *)reg->start,
+				(void *)reg->alignment);
+		}
+	}
+}
+
+
+static int __init cma_init(void)
+{
+	struct cma_region *reg, *n;
+
+	pr_debug("init: initialising\n");
+
+	if (cma_map) {
+		char *val = kmemdup(cma_map, cma_map_length + 1, GFP_KERNEL);
+		cma_map = val;
+		if (!val)
+			return -ENOMEM;
+		val[cma_map_length] = '\0';
+	}
+
+	list_for_each_entry_safe(reg, n, &cma_early_regions, list) {
+		INIT_LIST_HEAD(&reg->list);
+		/*
+		 * We don't care if there was an error.  It's a pity
+		 * but there's not much we can do about it any way.
+		 * If the error is on a region that was parsed from
+		 * command line then it will stay and waste a bit of
+		 * space; if it was registered using
+		 * cma_early_region_register() it's caller's
+		 * responsibility to do something about it.
+		 */
+		if (reg->reserved && cma_region_register(reg) < 0)
+			/* ignore error */;
+	}
+
+	INIT_LIST_HEAD(&cma_early_regions);
+
+	return 0;
+}
+/*
+ * We want to be initialised earlier than module_init/__initcall so
+ * that drivers that want to grab memory at boot time will get CMA
+ * ready.  subsys_initcall() seems early enough and not too early at
+ * the same time.
+ */
+subsys_initcall(cma_init);
+
+
+
+/************************* Chunks *************************/
+
+/* All chunks sorted by start address. */
+static struct rb_root cma_chunks_by_start;
+
+static struct cma_chunk *__must_check __cma_chunk_find(dma_addr_t addr)
+{
+	struct cma_chunk *chunk;
+	struct rb_node *n;
+
+	for (n = cma_chunks_by_start.rb_node; n; ) {
+		chunk = rb_entry(n, struct cma_chunk, by_start);
+		if (addr < chunk->start)
+			n = n->rb_left;
+		else if (addr > chunk->start)
+			n = n->rb_right;
+		else
+			return chunk;
+	}
+	WARN(1, "no chunk starting at %p\n", (void *)addr);
+	return NULL;
+}
+
+static int __must_check __cma_chunk_insert(struct cma_chunk *chunk)
+{
+	struct rb_node **new, *parent = NULL;
+	typeof(chunk->start) addr = chunk->start;
+
+	for (new = &cma_chunks_by_start.rb_node; *new; ) {
+		struct cma_chunk *c =
+			container_of(*new, struct cma_chunk, by_start);
+
+		parent = *new;
+		if (addr < c->start) {
+			new = &(*new)->rb_left;
+		} else if (addr > c->start) {
+			new = &(*new)->rb_right;
+		} else {
+			/*
+			 * We should never be here.  If we are it
+			 * means allocator gave us an invalid chunk
+			 * (one that has already been allocated) so we
+			 * refuse to accept it.  Our caller will
+			 * recover by freeing the chunk.
+			 */
+			WARN_ON(1);
+			return -EADDRINUSE;
+		}
+	}
+
+	rb_link_node(&chunk->by_start, parent, new);
+	rb_insert_color(&chunk->by_start, &cma_chunks_by_start);
+
+	return 0;
+}
+
+static void __cma_chunk_free(struct cma_chunk *chunk)
+{
+	rb_erase(&chunk->by_start, &cma_chunks_by_start);
+
+	chunk->reg->alloc->free(chunk);
+	--chunk->reg->users;
+	chunk->reg->free_space += chunk->size;
+}
+
+
+/************************* The Device API *************************/
+
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *type);
+
+
+/* Allocate. */
+
+static dma_addr_t __must_check
+__cma_alloc_from_region(struct cma_region *reg,
+			size_t size, dma_addr_t alignment)
+{
+	struct cma_chunk *chunk;
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!reg || reg->free_space < size)
+		return -ENOMEM;
+
+	if (!reg->alloc) {
+		if (!reg->used)
+			__cma_region_attach_alloc(reg);
+		if (!reg->alloc)
+			return -ENOMEM;
+	}
+
+	chunk = reg->alloc->alloc(reg, size, alignment);
+	if (!chunk)
+		return -ENOMEM;
+
+	if (unlikely(__cma_chunk_insert(chunk) < 0)) {
+		/* We should *never* be here. */
+		chunk->reg->alloc->free(chunk);
+		kfree(chunk);
+		return -EADDRINUSE;
+	}
+
+	chunk->reg = reg;
+	++reg->users;
+	reg->free_space -= chunk->size;
+	pr_debug("allocated at %p\n", (void *)chunk->start);
+	return chunk->start;
+}
+
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment)
+{
+	dma_addr_t addr;
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!size || alignment & (alignment - 1) || !reg)
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	addr = reg->registered ?
+		__cma_alloc_from_region(reg, PAGE_ALIGN(size),
+					max(alignment, (dma_addr_t)PAGE_SIZE)) :
+		-EINVAL;
+
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(cma_alloc_from_region);
+
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *type,
+	    size_t size, dma_addr_t alignment)
+{
+	struct cma_region *reg;
+	const char *from;
+	dma_addr_t addr;
+
+	if (dev)
+		pr_debug("allocate %p/%p for %s/%s\n",
+			 (void *)size, (void *)alignment,
+			 dev_name(dev), type ?: "");
+
+	if (!size || alignment & (alignment - 1))
+		return -EINVAL;
+
+	size = PAGE_ALIGN(size);
+	if (alignment < PAGE_SIZE)
+		alignment = PAGE_SIZE;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, type);
+	if (unlikely(IS_ERR(from))) {
+		addr = PTR_ERR(from);
+		goto done;
+	}
+
+	pr_debug("allocate %p/%p from one of %s\n",
+		 (void *)size, (void *)alignment, from);
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		addr = __cma_alloc_from_region(reg, size, alignment);
+		if (!IS_ERR_VALUE(addr))
+			goto done;
+	}
+
+	pr_debug("not enough memory\n");
+	addr = -ENOMEM;
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(__cma_alloc);
+
+
+/* Query information about regions. */
+static void __cma_info_add(struct cma_info *infop, struct cma_region *reg)
+{
+	infop->total_size += reg->size;
+	infop->free_size += reg->free_space;
+	if (infop->lower_bound > reg->start)
+		infop->lower_bound = reg->start;
+	if (infop->upper_bound < reg->start + reg->size)
+		infop->upper_bound = reg->start + reg->size;
+	++infop->count;
+}
+
+int
+__cma_info(struct cma_info *infop, const struct device *dev, const char *type)
+{
+	struct cma_info info = { ~(dma_addr_t)0, 0, 0, 0, 0 };
+	struct cma_region *reg;
+	const char *from;
+	int ret;
+
+	if (unlikely(!infop))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, type);
+	if (IS_ERR(from)) {
+		ret = PTR_ERR(from);
+		info.lower_bound = 0;
+		goto done;
+	}
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		if (reg)
+			__cma_info_add(&info, reg);
+	}
+
+	ret = 0;
+done:
+	mutex_unlock(&cma_mutex);
+
+	memcpy(infop, &info, sizeof info);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__cma_info);
+
+
+/* Freeing. */
+int cma_free(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+	int ret;
+
+	mutex_lock(&cma_mutex);
+
+	c = __cma_chunk_find(addr);
+
+	if (c) {
+		__cma_chunk_free(c);
+		ret = 0;
+	} else {
+		ret = -ENOENT;
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	if (c)
+		pr_debug("free(%p): freed\n", (void *)addr);
+	else
+		pr_err("free(%p): not found\n", (void *)addr);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_free);
+
+
+/************************* Miscellaneous *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg)
+{
+	struct cma_allocator *alloc;
+	int ret;
+
+	/*
+	 * If reg->alloc is set then caller wants us to use this
+	 * allocator.  Otherwise we need to find one by name.
+	 */
+	if (reg->alloc) {
+		alloc = reg->alloc;
+	} else {
+		alloc = __cma_allocator_find(reg->alloc_name);
+		if (!alloc) {
+			pr_warn("init: %s: %s: no such allocator\n",
+				reg->name ?: "(private)",
+				reg->alloc_name ?: "(default)");
+			reg->used = 1;
+			return -ENOENT;
+		}
+	}
+
+	/* Try to initialise the allocator. */
+	reg->private_data = NULL;
+	ret = alloc->init ? alloc->init(reg) : 0;
+	if (unlikely(ret < 0)) {
+		pr_err("init: %s: %s: unable to initialise allocator\n",
+		       reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+		reg->alloc = NULL;
+		reg->used = 1;
+	} else {
+		reg->alloc = alloc;
+		pr_debug("init: %s: %s: initialised allocator\n",
+			 reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+	}
+	return ret;
+}
+
+
+/*
+ * s            ::= rules
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' regions
+ * patterns     ::= pattern [ ',' patterns ]
+ * regions      ::= REG-NAME [ ',' regions ]
+ * pattern      ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+ */
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *type)
+{
+	/*
+	 * This function matches the pattern from the map attribute
+	 * against the given device name and type.  Type may of course
+	 * be NULL or an empty string.
+	 */
+
+	const char *s, *name;
+	int name_matched = 0;
+
+	/*
+	 * If dev is NULL we were called in alternative form where
+	 * type is the from string.  All we have to do is return it.
+	 */
+	if (!dev)
+		return type ?: ERR_PTR(-EINVAL);
+
+	if (!cma_map)
+		return ERR_PTR(-ENOENT);
+
+	name = dev_name(dev);
+	if (WARN_ON(!name || !*name))
+		return ERR_PTR(-EINVAL);
+
+	if (!type)
+		type = "common";
+
+	/*
+	 * Now we go through the cma_map attribute.
+	 */
+	for (s = cma_map; *s; ++s) {
+		const char *c;
+
+		/*
+		 * If the pattern starts with a slash, the device part of the
+		 * pattern matches if it matched previously.
+		 */
+		if (*s == '/') {
+			if (!name_matched)
+				goto look_for_next;
+			goto match_type;
+		}
+
+		/*
+		 * We are now trying to match the device name.  This also
+		 * updates the name_matched variable.  If, while reading the
+		 * spec, we encounter a comma, it means that the pattern does
+		 * not match and we need to start over with another pattern
+		 * (the one after the comma).  If we encounter an equal sign,
+		 * we need to start over with another rule.  If there is
+		 * a character that does not match, we need to look for a comma (to get
+		 * another pattern) or semicolon (to get another rule) and try
+		 * again if there is one somewhere.
+		 */
+
+		name_matched = 0;
+
+		for (c = name; *s != '*' && *c; ++c, ++s)
+			if (*s == '=')
+				goto next_rule;
+			else if (*s == ',')
+				goto next_pattern;
+			else if (*s != '?' && *c != *s)
+				goto look_for_next;
+		if (*s == '*')
+			++s;
+
+		name_matched = 1;
+
+		/*
+		 * Now we need to match the type part of the pattern.  If
+		 * the pattern has no type part, we match only if type
+		 * points to an empty string.  Otherwise we try to match it
+		 * just like the name.
+		 */
+		if (*s == '/') {
+match_type:		/* s points to '/' */
+			++s;
+
+			for (c = type; *s && *c; ++c, ++s)
+				if (*s == '=')
+					goto next_rule;
+				else if (*s == ',')
+					goto next_pattern;
+				else if (*c != *s)
+					goto look_for_next;
+		}
+
+		/* Return the string behind the '=' sign of the rule. */
+		if (*s == '=')
+			return s + 1;
+		else if (*s == ',')
+			return strchr(s, '=') + 1;
+
+		/* Pattern did not match */
+
+look_for_next:
+		do {
+			++s;
+		} while (*s != ',' && *s != '=');
+		if (*s == ',')
+			continue;
+
+next_rule:	/* s points to '=' */
+		s = strchr(s, ';');
+		if (!s)
+			break;
+
+next_pattern:
+		continue;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 2/9] mm: cma: Contiguous Memory Allocator added
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel

The Contiguous Memory Allocator framework is a set of APIs for
allocating physically contiguous chunks of memory.

Various chips require contiguous blocks of memory to operate.  Those
chips include devices such as cameras, hardware video decoders and
encoders, etc.

The code is highly modular and customisable to suit the needs of
various users.  Set of regions reserved for CMA can be configured
per-platform and it is easy to add custom allocator algorithms if one
has such need.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
---
 Documentation/00-INDEX              |    2 +
 Documentation/contiguous-memory.txt |  544 +++++++++++++++++++++
 include/linux/cma.h                 |  434 +++++++++++++++++
 mm/Kconfig                          |   41 ++
 mm/Makefile                         |    2 +
 mm/cma-best-fit.c                   |  407 ++++++++++++++++
 mm/cma.c                            |  911 +++++++++++++++++++++++++++++++++++
 7 files changed, 2341 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 8dfc670..f93e787 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -94,6 +94,8 @@ connector/
 	- docs on the netlink based userspace<->kernel space communication mod.
 console/
 	- documentation on Linux console drivers.
+contiguous-memory.txt
+	- documentation on physically-contiguous memory allocation framework.
 cpu-freq/
 	- info on CPU frequency and voltage scaling.
 cpu-hotplug.txt
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
new file mode 100644
index 0000000..e470c6f
--- /dev/null
+++ b/Documentation/contiguous-memory.txt
@@ -0,0 +1,544 @@
+                                                             -*- org -*-
+
+* Contiguous Memory Allocator
+
+   The Contiguous Memory Allocator (CMA) is a framework, which allows
+   setting up a machine-specific configuration for physically-contiguous
+   memory management. Memory for devices is then allocated according
+   to that configuration.
+
+   The main role of the framework is not to allocate memory, but to
+   parse and manage memory configurations, as well as to act as an
+   intermediary between device drivers and pluggable allocators. It is
+   thus not tied to any memory allocation method or strategy.
+
+** Why is it needed?
+
+    Various devices on embedded systems have no scatter-gather and/or
+    IO map support and as such require contiguous blocks of memory to
+    operate.  They include devices such as cameras, hardware video
+    decoders and encoders, etc.
+
+    Such devices often require big memory buffers (a full HD frame is,
+    for instance, more than 2 megapixels large, i.e. more than 6 MB
+    of memory), which makes mechanisms such as kmalloc() ineffective.
+
+    Some embedded devices impose additional requirements on the
+    buffers, e.g. they can operate only on buffers allocated in a
+    particular location/memory bank (if the system has more than one
+    memory bank) or on buffers aligned to a particular memory boundary.
+
+    Development of embedded devices has seen a big rise recently
+    (especially in the V4L area) and many such drivers include their
+    own memory allocation code. Most of them use bootmem-based methods.
+    The CMA framework is an attempt to unify contiguous memory allocation
+    mechanisms and provide a simple API for device drivers, while
+    staying as customisable and modular as possible.
+
+** Design
+
+    The main design goal for the CMA was to provide a customisable and
+    modular framework, which could be configured to suit the needs of
+    individual systems.  Configuration specifies a list of memory
+    regions, which then are assigned to devices.  Memory regions can
+    be shared among many device drivers or assigned exclusively to
+    one.  This has been achieved in the following ways:
+
+    1. The core of the CMA does not handle allocation of memory and
+       management of free space.  Dedicated allocators are used for
+       that purpose.
+
+       This way, if the provided solution does not match demands
+       imposed on a given system, one can develop a new algorithm and
+       easily plug it into the CMA framework.
+
+       The presented solution includes an implementation of a best-fit
+       algorithm.
+
+    2. When requesting memory, devices have to introduce themselves.
+       This way CMA knows who the memory is allocated for.  This
+       allows for the system architect to specify which memory regions
+       each device should use.
+
+    3. Memory regions are grouped into various "types".  When a device
+       requests a chunk of memory, it can specify what type of memory
+       it needs.  If no type is specified, "common" is assumed.
+
+       This makes it possible to configure the system in such a way
+       that a single device may get memory from different memory
+       regions, depending on the "type" of memory it requested.  For
+       example, a video codec driver might want to allocate some
+       shared buffers from the first memory bank and others from the
+       second to get the highest possible memory throughput.
+
+    4. For greater flexibility and extensibility, the framework allows
+       device drivers to register private regions of reserved memory
+       which then may be used only by them.
+
+       In effect, even if a driver does not use the rest of the CMA
+       interface, it can still use CMA allocators and other
+       mechanisms.
+
+       4a. Early in the boot process, device drivers can also request
+           the CMA framework to reserve a region of memory for them
+           which then will be used as a private region.
+
+           This way, drivers do not need to directly call bootmem,
+           memblock or similar early allocator but merely register an
+           early region and the framework will handle the rest
+           including choosing the right early allocator.
+
+** Use cases
+
+    Let's analyse an imaginary system that uses CMA to see how the
+    framework can be used and configured.
+
+
+    We have a platform with a hardware video decoder and a camera each
+    needing 20 MiB of memory in the worst case.  Our system is written
+    in such a way though that the two devices are never used at the
+    same time and memory for them may be shared.  In such a system the
+    following configuration would be used in the platform
+    initialisation code:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video,camera=region";
+
+        cma_set_defaults(regions, map);
+
+    The regions array defines a single 20-MiB region named "region".
+    The map says that drivers named "video" and "camera" are to be
+    granted memory from the previously defined region.
+
+    A shorter map can be used as well:
+
+        static const char map[] __initconst = "*=region";
+
+    The asterisk ("*") matches all devices thus all devices will use
+    the region named "region".
+
+    We can see that because the devices share the same memory region,
+    we save 20 MiB compared to the situation where each of the devices
+    would reserve 20 MiB of memory for itself.
+
+
+    Now, let's say that we also have many other smaller devices and we
+    want them to share a smaller pool of memory, for instance 5 MiB.
+    This can be achieved in the following way:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst =
+                "video,camera=region;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This instructs CMA to reserve two regions and let video and camera
+    use region "region" whereas all other devices should use region
+    "common".
+
+
+    Later on, after some development of the system, it can now run the
+    video decoder and the camera at the same time.  The 20 MiB region is
+    no longer enough for the two to share.  A quick fix can be made to
+    grant each of those devices separate regions:
+
+        static struct cma_region regions[] = {
+                { .name = "v", .size = 20 << 20 },
+                { .name = "c", .size = 20 << 20 },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video=v;camera=c;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This solution also shows how with CMA you can assign private pools
+    of memory to each device if that is required.
+
+
+    Allocation mechanisms can be replaced dynamically in a similar
+    manner as well. Let's say that during testing, it has been
+    discovered that, for a given shared region of 40 MiB,
+    fragmentation has become a problem.  It has been observed that,
+    after some time, it becomes impossible to allocate buffers of the
+    required sizes. So to satisfy our requirements, we would have to
+    reserve a larger shared region beforehand.
+
+    But fortunately, you have also managed to develop a new allocation
+    algorithm -- Neat Allocation Algorithm or "na" for short -- which
+    satisfies the needs for both devices even on a 30 MiB region.  The
+    configuration can be then quickly changed to:
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 30 << 20, .alloc_name = "na" },
+                { .name = "common", .size =  5 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "video,camera=region;*=common";
+
+        cma_set_defaults(regions, map);
+
+    This shows how you can develop your own allocation algorithms if
+    the ones provided with CMA do not suit your needs and easily
+    replace them, without the need to modify the CMA core or even
+    recompile the kernel.
+
+** Technical Details
+
+*** The attributes
+
+    As shown above, CMA is configured by two attributes: a list of
+    regions and a map.  The first one specifies the regions that are to
+    be reserved for CMA.  The second one specifies which regions each
+    device is assigned to.
+
+**** Regions
+
+     The regions attribute is a list of regions terminated by a region
+     with size equal to zero.  The following fields may be set:
+
+     - size       -- size of the region (required, must not be zero)
+     - alignment  -- alignment of the region; must be power of two or
+                     zero (optional)
+     - start      -- where the region has to start (optional)
+     - alloc_name -- the name of allocator to use (optional)
+     - alloc      -- allocator to use (optional; alloc_name is
+                     probably what you want instead)
+
+     size, alignment and start are specified in bytes.  Size will be
+     aligned up to PAGE_SIZE.  If alignment is less than PAGE_SIZE
+     it will be set to PAGE_SIZE.  start will be aligned to
+     alignment.
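+
+     For illustration, a single region using most of these fields could
+     be described as follows (a sketch only; the region name and sizes
+     are made up, "bf" is the best-fit allocator added by this patch):
+
+         static struct cma_region regions[] = {
+                 { .name = "fb", .size = 8 << 20,
+                   .alignment = 1 << 20, .alloc_name = "bf" },
+                 { }
+         };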
+
+**** Map
+
+     The format of the "map" attribute is as follows:
+
+         map-attr      ::= [ rules [ ';' ] ]
+         rules         ::= rule [ ';' rules ]
+         rule          ::= patterns '=' regions
+
+         patterns      ::= pattern [ ',' patterns ]
+
+         regions       ::= REG-NAME [ ',' regions ]
+                       // list of regions to try to allocate memory
+                       // from
+
+         pattern       ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+                       // pattern request must match for the rule to
+                       // apply; the first rule that matches is
+                       // applied; if dev-pattern part is omitted
+                       // value identical to the one used in previous
+                       // pattern is assumed.
+
+         dev-pattern   ::= PATTERN
+                       // pattern that device name must match for the
+                       // rule to apply; may contain question marks
+                       // which match any character and end with an
+                       // asterisk which matches the rest of the string
+                       // (including nothing).
+
+     It is a sequence of rules which specify which regions a given
+     (device, type) pair should use.  The first rule that matches is
+     applied.
+
+     For a rule to match, the pattern must match the (dev, type) pair.
+     A pattern consists of a part before and a part after the slash.
+     The first part must match the device name and the second part
+     must match the type.
+
+     If the first part is empty, the device name is assumed to match
+     iff it matched in the previous pattern.  If the second part is
+     omitted it will match any type of memory requested by the device.
+
+     Some examples (whitespace added for better readability):
+
+         cma_map = foo/quaz = r1;
+                       // device foo with type == "quaz" uses region r1
+
+                   foo/* = r2;     // OR:
+                   /* = r2;
+                       // device foo with any other kind uses region r2
+
+                   bar = r1,r2;
+                       // device bar uses region r1 or r2
+
+                   baz?/a , baz?/b = r3;
+                       // devices named baz? where ? is any character
+                       // with type being "a" or "b" use r3
+
+*** The device and types of memory
+
+    The name of the device is taken from the device structure.  It is
+    not possible to use CMA if the driver does not register a device
+    (although this can be overcome by providing a fake device structure
+    with at least the name set).
+
+    The type of memory is an optional argument provided by the device
+    whenever it requests a memory chunk.  In many cases this can be
+    ignored but sometimes it may be required for some devices.
+
+    For instance, let's say that there are two memory banks and for
+    performance reasons a device uses buffers in both of them.
+    The platform defines memory types "a" and "b" for regions in both
+    banks.  The device driver would then use those two types to
+    request memory chunks from different banks.  CMA attributes could
+    look as follows:
+
+         static struct cma_region regions[] = {
+                 { .name = "a", .size = 32 << 20 },
+                 { .name = "b", .size = 32 << 20, .start = 512 << 20 },
+                 { }
+         };
+         static const char map[] __initconst = "foo/a=a;foo/b=b;*=a,b";
+
+    And whenever the driver allocates memory it would specify the
+    type of memory:
+
+        buffer1 = cma_alloc(dev, "a", 1 << 20, 0);
+        buffer2 = cma_alloc(dev, "b", 1 << 20, 0);
+
+    If the driver should also try to allocate from the other bank when
+    the dedicated one is full, the map attribute could be changed to:
+
+         static const char map[] __initconst = "foo/a=a,b;foo/b=b,a;*=a,b";
+
+    On the other hand, if the same driver was used on a system with
+    only one bank, the configuration could be changed just to:
+
+         static struct cma_region regions[] = {
+                 { .name = "r", .size = 64 << 20 },
+                 { }
+         };
+         static const char map[] __initconst = "*=r";
+
+    without the need to change the driver at all.
+
+*** Device API
+
+    There are three basic calls provided by the CMA framework to
+    devices.  To allocate a chunk of memory, the cma_alloc() function
+    needs to be used:
+
+        dma_addr_t cma_alloc(const struct device *dev, const char *type,
+                             size_t size, dma_addr_t alignment);
+
+    If required, the device may specify an alignment in bytes that the
+    chunk needs to satisfy.  It has to be a power of two or zero.  The
+    chunks are always aligned at least to a page.
+
+    The type specifies the type of memory as described in the previous
+    subsection.  If the device driver does not care about the memory
+    type it can safely pass NULL as the type, which is the same as
+    passing "common".
+
+    The basic usage of the function is just:
+
+        addr = cma_alloc(dev, NULL, size, 0);
+
+    The function returns the bus address of the allocated chunk or a
+    value that evaluates to true when checked with IS_ERR_VALUE(), so
+    the correct way of checking for errors is:
+
+        unsigned long addr = cma_alloc(dev, NULL, size, 0);
+        if (IS_ERR_VALUE(addr))
+                /* Error */
+                return (int)addr;
+        /* Allocated */
+
+    (Make sure to include <linux/err.h> which contains the definition
+    of the IS_ERR_VALUE() macro.)
+
+
+    An allocated chunk is freed via the cma_free() function:
+
+        int cma_free(dma_addr_t addr);
+
+    It takes the bus address of the chunk as an argument and frees it.
+
+
+    The last function is cma_info(), which returns information about
+    the regions assigned to a given (dev, type) pair.  Its prototype is:
+
+        int cma_info(struct cma_info *info,
+                     const struct device *dev,
+                     const char *type);
+
+    On successful exit it fills the info structure with the lower and
+    upper bound of the regions, their total size and the number of
+    regions assigned to the given (dev, type) pair.
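+
+    For instance, a driver could use it to check how much memory it can
+    hope to get from its regions (a minimal sketch; as with cma_alloc(),
+    dev is the driver's device):
+
+        struct cma_info info;
+        int ret = cma_info(&info, dev, NULL);
+
+        if (ret)
+                return ret;
+        pr_info("%u region(s), %zu bytes total, %zu bytes free\n",
+                info.count, info.total_size, info.free_size);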
+
+**** Dynamic and private regions
+
+     In the basic setup, regions are provided and initialised by
+     platform initialisation code (which usually uses
+     cma_set_defaults() for that purpose).
+
+     It is, however, possible to create and add regions dynamically
+     using cma_region_register() function.
+
+         int cma_region_register(struct cma_region *reg);
+
+     The region does not have to have a name.  If it does not, it won't
+     be accessible via the standard mapping (the one provided with the
+     map attribute).  Such regions are private and to allocate a chunk
+     from them, one needs to call:
+
+         dma_addr_t cma_alloc_from_region(struct cma_region *reg,
+                                          size_t size, dma_addr_t alignment);
+
+     It is just like cma_alloc() except that one specifies which region
+     to allocate memory from.  The region must have been registered.
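+
+     A minimal sketch of how a driver could set up and use such
+     a private region (start and size must describe memory that has
+     already been reserved; the values below are made up):
+
+         static struct cma_region reg = {
+                 .start = 0x40000000,    /* hypothetical reserved area */
+                 .size  = 4 << 20,
+         };
+         dma_addr_t addr;
+         int ret;
+
+         ret = cma_region_register(&reg);
+         if (ret)
+                 return ret;
+
+         addr = cma_alloc_from_region(&reg, 1 << 20, 0);
+         if (IS_ERR_VALUE(addr))
+                 return (int)addr;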
+
+**** Allocating from region specified by name
+
+     If a driver prefers allocating from a region or a list of regions
+     whose names it knows, it can use a different call similar to the
+     previous one:
+
+         dma_addr_t cma_alloc_from(const char *regions,
+                                   size_t size, dma_addr_t alignment);
+
+     The first argument is a comma-separated list of regions the
+     driver desires CMA to try and allocate from.  The list is
+     terminated by a NUL byte or a semicolon.
+
+     Similarly, there is a call for requesting information about named
+     regions:
+
+        int cma_info_about(struct cma_info *info, const char *regions);
+
+     Generally, these interfaces should not be needed, but they are
+     provided nevertheless.
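+
+     For example, reusing the region names "a" and "b" defined earlier
+     in this document, a driver could ask for a chunk from the two
+     regions, trying them in order (a sketch only):
+
+         dma_addr_t addr = cma_alloc_from("a,b", 1 << 20, 0);
+
+         if (IS_ERR_VALUE(addr))
+                 return (int)addr;
+
+     cma_info_about() takes the same comma-separated list as its second
+     argument.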
+
+**** Registering early regions
+
+     An early region is a region that is managed by CMA early during
+     the boot process.  It is the platform's responsibility to reserve
+     memory for early regions.  Later on, when CMA initialises, early
+     regions with reserved memory are registered as normal regions.
+     Registering an early region may be a way for a device to request
+     a private pool of memory without worrying about actually
+     reserving the memory:
+
+         int cma_early_region_register(struct cma_region *reg);
+
+     This needs to be done quite early in the boot process, before the
+     platform traverses the cma_early_regions list to reserve memory.
+
+     When the boot process ends, the device driver may check whether
+     the region was reserved (by checking the reg->reserved flag) and,
+     if so, whether it was successfully registered as a normal region
+     (by checking the reg->registered flag).  If that is the case, the
+     device driver can use the normal API calls on the region.
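+
+     A sketch of how a driver could use this (the names are made up;
+     how the registration is hooked in early enough is platform
+     specific):
+
+         static struct cma_region foo_region = {
+                 /* .name left unset: this will be a private region */
+                 .size = 2 << 20,
+         };
+
+         /* Called early, before the platform reserves early regions. */
+         static int __init foo_early_setup(void)
+         {
+                 return cma_early_region_register(&foo_region);
+         }
+
+         /* Later, e.g. in the driver's probe function: */
+         if (foo_region.registered)
+                 addr = cma_alloc_from_region(&foo_region, 1 << 20, 0);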
+
+*** Allocator operations
+
+    Creating an allocator for CMA needs four functions to be
+    implemented.
+
+
+    The first two are used to initialise an allocator on a given region
+    and to clean up afterwards:
+
+        int  cma_foo_init(struct cma_region *reg);
+        void cma_foo_cleanup(struct cma_region *reg);
+
+    The first is called when the allocator is attached to a region.
+    When the function is called, the cma_region structure is fully
+    initialised (ie. starting address and size have correct values).
+    As a matter of fact, the allocator should never modify the
+    cma_region structure other than the private_data field, which it
+    may use to point to its private data.
+
+    The second call cleans up and frees all resources the allocator
+    has allocated for the region.  The function can assume that all
+    chunks allocated from this region have been freed, thus the whole
+    region is free.
+
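+    A minimal sketch of these two for a hypothetical "foo" allocator
+    (the private structure is made up):
+
+        struct cma_foo_private {
+                /* allocator-specific view of the region's free space */
+                struct list_head holes;
+        };
+
+        int cma_foo_init(struct cma_region *reg)
+        {
+                struct cma_foo_private *prv;
+
+                prv = kzalloc(sizeof *prv, GFP_KERNEL);
+                if (!prv)
+                        return -ENOMEM;
+                /* reg->start and reg->size are already valid here */
+                INIT_LIST_HEAD(&prv->holes);
+                reg->private_data = prv;
+                return 0;
+        }
+
+        void cma_foo_cleanup(struct cma_region *reg)
+        {
+                kfree(reg->private_data);
+        }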
+
+    The two other calls are used for allocating and freeing chunks.
+    They are:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment);
+        void cma_foo_free(struct cma_chunk *chunk);
+
+    As the names imply, the first allocates a chunk of memory and the
+    other frees it.  The allocator also manages the cma_chunk object
+    representing the chunk in physical memory (alloc creates it, free
+    releases it).
+
+    Either of those functions can assume that it is the only thread
+    accessing the region.  Therefore, the allocator does not need to
+    worry about concurrency.  Moreover, all arguments are guaranteed
+    to be valid (i.e. a page-aligned size and a power-of-two alignment
+    no smaller than a page size).
+
+
+    When the allocator is ready, all that is left is to register it by
+    calling the cma_allocator_register() function:
+
+            int cma_allocator_register(struct cma_allocator *alloc);
+
+    The argument is a structure with pointers to the above functions
+    and the allocator's name.  The whole call may look something like
+    this:
+
+        static struct cma_allocator alloc = {
+                .name    = "foo",
+                .init    = cma_foo_init,
+                .cleanup = cma_foo_cleanup,
+                .alloc   = cma_foo_alloc,
+                .free    = cma_foo_free,
+        };
+        return cma_allocator_register(&alloc);
+
+    The name ("foo") will be used when a this particular allocator is
+    requested as an allocator for given region.
+
+*** Integration with platform
+
+    There is one function that needs to be called from platform
+    initialisation code.  That is the cma_early_regions_reserve()
+    function:
+
+        void cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+    It traverses the list of all the early regions provided by the
+    platform and registered by drivers and reserves memory for them.
+    The only argument is a callback function used to reserve the
+    region.  Passing NULL as the argument is the same as passing the
+    cma_early_region_reserve() function, which uses bootmem or
+    memblock for the reservation.
+
+    Alternatively, platform code could traverse the cma_early_regions
+    list by itself but this should never be necessary.
+
+
+    The platform also has a way of providing default attributes for
+    CMA; the cma_set_defaults() function is used for that purpose:
+
+        int cma_set_defaults(struct cma_region *regions, const char *map)
+
+    It needs to be called prior to reserving regions.  It lets one
+    specify the list of regions defined by the platform and the map
+    attribute.  The map may point to a string in __initdata.  See
+    above in this document for example usage of this function.
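+
+    Putting the two together, the relevant part of a platform's
+    initialisation could look roughly like this (a sketch only; the
+    function name is made up and where exactly it is called, e.g. from
+    the machine's memory reservation hook, is platform specific):
+
+        static struct cma_region regions[] = {
+                { .name = "region", .size = 20 << 20 },
+                { }
+        };
+        static const char map[] __initconst = "*=region";
+
+        static void __init foo_platform_reserve(void)
+        {
+                if (cma_set_defaults(regions, map))
+                        return;
+                cma_early_regions_reserve(NULL);
+        }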
+
+** Future work
+
+    In the future, implementation of mechanisms that would allow the
+    free space inside the regions to be used as page cache, filesystem
+    buffers or swap devices is planned.  With such mechanisms, the
+    memory would not be wasted when not used.
+
+    Because all allocations and freeing of chunks pass through the CMA
+    framework, it can track which parts of the reserved memory are
+    free and which are allocated.  Tracking the unused memory
+    would let CMA use it for other purposes such as page cache, I/O
+    buffers, swap, etc.
diff --git a/include/linux/cma.h b/include/linux/cma.h
new file mode 100644
index 0000000..f6f9cb5
--- /dev/null
+++ b/include/linux/cma.h
@@ -0,0 +1,434 @@
+#ifndef __LINUX_CMA_H
+#define __LINUX_CMA_H
+
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+/***************************** Kernel level API *****************************/
+
+#ifdef __KERNEL__
+
+#include <linux/rbtree.h>
+#include <linux/list.h>
+#include <linux/init.h>
+
+
+struct device;
+struct cma_info;
+
+/*
+ * Don't call it directly, use cma_alloc(), cma_alloc_from() or
+ * cma_alloc_from_region().
+ */
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *type,
+	    size_t size, dma_addr_t alignment);
+
+/* Don't call it directly, use cma_info() or cma_info_about(). */
+int
+__cma_info(struct cma_info *info, const struct device *dev, const char *type);
+
+
+/**
+ * cma_alloc - allocates contiguous chunk of memory.
+ * @dev:	The device to perform allocation for.
+ * @type:	A type of memory to allocate.  Platform may define
+ *		several different types of memory and device drivers
+ *		can then request chunks of different types.  Usually it's
+ *		safe to pass NULL here which is the same as passing
+ *		"common".
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc(const struct device *dev, const char *type,
+	  size_t size, dma_addr_t alignment)
+{
+	return dev ? __cma_alloc(dev, type, size, alignment) : -EINVAL;
+}
+
+
+/**
+ * struct cma_info - information about regions returned by cma_info().
+ * @lower_bound:	The smallest address that is possible to be
+ *			allocated for given (dev, type) pair.
+ * @upper_bound:	The one byte after the biggest address that is
+ *			possible to be allocated for given (dev, type)
+ *			pair.
+ * @total_size:	Total size of regions mapped to (dev, type) pair.
+ * @free_size:	Total free size in all of the regions mapped to (dev, type)
+ *		pair.  Because of possible race conditions, it is not
+ *		guaranteed that the value will be correct -- it gives only
+ *		an approximation.
+ * @count:	Number of regions mapped to (dev, type) pair.
+ */
+struct cma_info {
+	dma_addr_t lower_bound, upper_bound;
+	size_t total_size, free_size;
+	unsigned count;
+};
+
+/**
+ * cma_info - queries information about regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @dev:	The device to query information for.
+ * @type:	A type of memory to query information for.
+ *		If unsure, pass NULL here which is equal to passing
+ *		"common".
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info(struct cma_info *info, const struct device *dev, const char *type)
+{
+	return dev ? __cma_info(info, dev, type) : -EINVAL;
+}
+
+
+/**
+ * cma_free - frees a chunk of memory.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns -ENOENT if there is no chunk at given location; otherwise
+ * zero.  In the former case issues a warning.
+ */
+int cma_free(dma_addr_t addr);
+
+
+
+/****************************** Lower lever API *****************************/
+
+/**
+ * cma_alloc_from - allocates contiguous chunk of memory from named regions.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ *		byte or a semicolon.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)
+{
+	return __cma_alloc(NULL, regions, size, alignment);
+}
+
+/**
+ * cma_info_about - queries information about named regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ *		byte or a semicolon.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+{
+	return __cma_info(info, NULL, regions);
+}
+
+
+
+struct cma_allocator;
+
+/**
+ * struct cma_region - a region reserved for CMA allocations.
+ * @name:	Unique name of the region.  Read only.
+ * @start:	Bus address of the region in bytes.  Always aligned at
+ *		least to a full page.  Read only.
+ * @size:	Size of the region in bytes.  Multiply of a page size.
+ *		Read only.
+ * @free_space:	Free space in the region.  Read only.
+ * @alignment:	Desired alignment of the region in bytes.  A power of two,
+ *		always at least page size.  Early.
+ * @alloc:	Allocator used with this region.  NULL means allocator is
+ *		not attached.  Private.
+ * @alloc_name:	Allocator name read from cmdline.  Private.  This may be
+ *		different from @alloc->name.
+ * @private_data:	Allocator's private data.
+ * @users:	Number of chunks allocated in this region.
+ * @list:	Entry in list of regions.  Private.
+ * @used:	Whether the region was already used, ie. there was at least
+ *		one allocation request for it.  Private.
+ * @registered:	Whether this region has been registered.  Read only.
+ * @reserved:	Whether this region has been reserved.  Early.  Read only.
+ * @copy_name:	Whether @name and @alloc_name needs to be copied when
+ *		this region is converted from early to normal.  Early.
+ *		Private.
+ * @free_alloc_name:	Whether @alloc_name was kmalloced().  Private.
+ *
+ * Regions come in two types: an early region and normal region.  The
+ * former can be reserved or not-reserved.  Fields marked as "early"
+ * are only meaningful in early regions.
+ *
+ * Early regions are important only during initialisation.  The list
+ * of early regions is built from the "cma" command line argument or
+ * platform defaults.  Platform initialisation code is responsible for
+ * reserving space for unreserved regions that are placed on
+ * cma_early_regions list.
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+struct cma_region {
+	const char *name;
+	dma_addr_t start;
+	size_t size;
+	union {
+		size_t free_space;	/* Normal region */
+		dma_addr_t alignment;	/* Early region */
+	};
+
+	struct cma_allocator *alloc;
+	const char *alloc_name;
+	void *private_data;
+
+	unsigned users;
+	struct list_head list;
+
+	unsigned used:1;
+	unsigned registered:1;
+	unsigned reserved:1;
+	unsigned copy_name:1;
+	unsigned free_alloc_name:1;
+};
+
+
+/**
+ * cma_region_register() - registers a region.
+ * @reg:	Region to register.
+ *
+ * Region's start and size must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region and accessible only using the
+ * cma_alloc_from_region() function.
+ *
+ * If alloc is set, the function will try to initialise the given
+ * allocator (and will return an error if it fails).  Otherwise
+ * alloc_name may point to the name of an allocator to use (if not
+ * set, the default will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  In particular, -EADDRINUSE if the
+ * region overlaps with an already existing region.
+ */
+int __must_check cma_region_register(struct cma_region *reg);
+
+/**
+ * cma_region_unregister() - unregisters a region.
+ * @reg:	Region to unregister.
+ *
+ * The region is unregistered only if there are no chunks allocated
+ * from it.  Otherwise, the function returns -EBUSY.
+ *
+ * On success returns zero.
+ */
+int __must_check cma_region_unregister(struct cma_region *reg);
+
+
+/**
+ * cma_alloc_from_region() - allocates contiguous chunk of memory from region.
+ * @reg:	Region to allocate chunk from.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ *		zero.  If alignment is less than a page size it will be
+ *		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise bus address of the chunk is returned.
+ */
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);
+
+
+
+/****************************** Allocators API ******************************/
+
+/**
+ * struct cma_chunk - an allocated contiguous chunk of memory.
+ * @start:	Bus address in bytes.
+ * @size:	Size in bytes.
+ * @reg:	Region this chunk belongs to.
+ * @by_start:	A node in a red-black tree with all chunks sorted by
+ *		start address.
+ *
+ * The cma_allocator::alloc() operation needs to set only the @start
+ * and @size fields.  The rest is handled by the caller (ie. CMA
+ * glue).
+ */
+struct cma_chunk {
+	dma_addr_t start;
+	size_t size;
+
+	struct cma_region *reg;
+	struct rb_node by_start;
+};
+
+
+/**
+ * struct cma_allocator - a CMA allocator.
+ * @name:	Allocator's unique name
+ * @init:	Initialises an allocator on given region.
+ * @cleanup:	Cleans up after init.  May assume that there are no chunks
+ *		allocated in given region.
+ * @alloc:	Allocates a chunk of memory of given size in bytes and
+ *		with given alignment.  Alignment is a power of
+ *		two (thus non-zero) and callback does not need to check it.
+ *		May also assume that it is the only call that uses given
+ *		region (ie. access to the region is synchronised with
+ *		a mutex).  This has to allocate the chunk object (it may be
+ *		contained in a bigger structure with allocator-specific
+ *		data).  Required.
+ * @free:	Frees allocated chunk.  May also assume that it is the only
+ *		call that uses given region.  This has to free() the chunk
+ *		object as well.  Required.
+ * @list:	Entry in list of allocators.  Private.
+ */
+struct cma_allocator {
+	const char *name;
+
+	int (*init)(struct cma_region *reg);
+	void (*cleanup)(struct cma_region *reg);
+	struct cma_chunk *(*alloc)(struct cma_region *reg, size_t size,
+				   dma_addr_t alignment);
+	void (*free)(struct cma_chunk *chunk);
+
+	struct list_head list;
+};
+
+
+/**
+ * cma_allocator_register() - Registers an allocator.
+ * @alloc:	Allocator to register.
+ *
+ * Adds allocator to the list of allocators managed by CMA.
+ *
+ * All of the fields of cma_allocator structure must be set except for
+ * the optional name and the list's head which will be overridden
+ * anyway.
+ *
+ * Returns zero or negative error code.
+ */
+int cma_allocator_register(struct cma_allocator *alloc);
+
+
+/**************************** Initialisation API ****************************/
+
+/**
+ * cma_set_defaults() - specifies default command line parameters.
+ * @regions:	A zero-sized entry terminated list of early regions.
+ *		This array must not be placed in __initdata section.
+ * @map:	Map attribute.
+ *
+ * This function should be called prior to cma_early_regions_reserve()
+ * and after early parameters have been parsed.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_set_defaults(struct cma_region *regions, const char *map);
+
+
+/**
+ * cma_early_regions - a list of early regions.
+ *
+ * Platform needs to reserve space for each of the regions before
+ * initcalls are executed.  If space is reserved, the reserved flag
+ * must be set.  Platform initialisation code may choose to use
+ * cma_early_regions_reserve().
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+extern struct list_head cma_early_regions __initdata;
+
+
+/**
+ * cma_early_region_register() - registers an early region.
+ * @reg:	Region to add.
+ *
+ * Region's size, start and alignment must be set (however the last
+ * two can be zero).  If name is set the region will be accessible
+ * using normal mechanisms like the mapping or the cma_alloc_from()
+ * function; otherwise it will be a private region accessible only
+ * using cma_alloc_from_region().
+ *
+ * During platform initialisation, space is reserved for early
+ * regions.  Later, when CMA initialises, the early regions are
+ * "converted" into normal regions.  If cma_region::alloc is set, CMA
+ * will then try to set up the given allocator on the region.  Failure to
+ * do so will result in the region not being registered even though
+ * the space for it will still be reserved.  If cma_region::alloc is
+ * not set, allocator will be attached to the region on first use and
+ * the value of cma_region::alloc_name will be taken into account if
+ * set.
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  No checking if regions overlap is
+ * performed.
+ */
+int __init __must_check cma_early_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_early_region_reserve() - reserves a physically contiguous memory region.
+ * @reg:	Early region to reserve memory for.
+ *
+ * If the platform supports bootmem, this is the first allocator this
+ * function tries to use.  If that fails (or bootmem is not
+ * supported), the function tries to use memblock if it is available.
+ *
+ * On success sets reg->reserved flag.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_early_region_reserve(struct cma_region *reg);
+
+/**
+ * cma_early_regions_reserve() - helper function for reserving early regions.
+ * @reserve:	Callback function used to reserve space for a region.
+ *		Needs to return non-negative if allocation succeeded,
+ *		negative error otherwise.  NULL means
+ *		cma_early_region_reserve() will be used.
+ *
+ * This function traverses the %cma_early_regions list and tries to
+ * reserve memory for each early region.  It uses the @reserve
+ * callback function for that purpose.  The reserved flag of each
+ * region is updated accordingly.
+ */
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+#else
+
+#define cma_set_defaults(regions, map)     ((int)0)
+#define cma_early_region_reserve(region)   ((int)-EOPNOTSUPP)
+#define cma_early_regions_reserve(reserve) do { } while (0)
+
+#endif
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index f4e516e..86043a3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -301,3 +301,44 @@ config NOMMU_INITIAL_TRIM_EXCESS
 	  of 1 says that all excess pages should be trimmed.
 
 	  See Documentation/nommu-mmap.txt for more information.
+
+
+config CMA
+	bool "Contiguous Memory Allocator framework"
+	# Currently there is only one allocator so force it on
+	select CMA_BEST_FIT
+	help
+	  This enables the Contiguous Memory Allocator framework which
+	  allows drivers to allocate big physically-contiguous blocks of
+	  memory for use with hardware components that support neither
+	  I/O mapping nor scatter-gather.
+
+	  If you select this option you will also have to select at least
+	  one allocator algorithm below.
+
+	  To make use of CMA you need to specify the regions and
+	  driver->region mapping on command line when booting the kernel.
+
+	  For more information see <Documentation/contiguous-memory.txt>.
+	  If unsure, say "n".
+
+config CMA_DEBUG
+	bool "CMA debug messages (DEVELOPMENT)"
+	depends on CMA
+	help
+	  Turns on debug messages in CMA.  This produces KERN_DEBUG
+	  messages for every CMA call as well as various messages while
+	  processing calls such as cma_alloc().  This option does not
+	  affect warning and error messages.
+
+	  This is mostly used during development.  If unsure, say "n".
+
+config CMA_BEST_FIT
+	bool "CMA best-fit allocator"
+	depends on CMA
+	help
+	  This is a best-fit algorithm running in O(n log n) time where
+	  n is the number of existing holes (which is never greater than
+	  the number of allocated regions and usually much smaller).  It
+	  allocates an area from the smallest hole that is big enough for
+	  the allocation in question.
diff --git a/mm/Makefile b/mm/Makefile
index 34b2546..d8c717f 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -47,3 +47,5 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
diff --git a/mm/cma-best-fit.c b/mm/cma-best-fit.c
new file mode 100644
index 0000000..97f8d61
--- /dev/null
+++ b/mm/cma-best-fit.c
@@ -0,0 +1,407 @@
+/*
+ * Contiguous Memory Allocator framework: Best Fit allocator
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: bf: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/slab.h>        /* kmalloc() */
+
+#include <linux/cma.h>         /* CMA structures */
+
+
+/************************* Data Types *************************/
+
+struct cma_bf_item {
+	struct cma_chunk ch;
+	struct rb_node by_size;
+};
+
+struct cma_bf_private {
+	struct rb_root by_start_root;
+	struct rb_root by_size_root;
+};
+
+
+/************************* Prototypes *************************/
+
+/*
+ * Those are only for holes.  They must be called whenever hole's
+ * properties change but also whenever a chunk becomes a hole or a hole
+ * becomes a chunk.
+ */
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item);
+static int  __must_check
+__cma_bf_hole_insert_by_start(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item);
+
+/**
+ * __cma_bf_hole_take - takes a chunk of memory out of a hole.
+ * @hole:	hole to take chunk from
+ * @size:	chunk's size
+ * @alignment:	chunk's starting address alignment (must be power of two)
+ *
+ * Takes a @size bytes large chunk from hole @hole which must be able
+ * to hold the chunk.  The "must be able" includes also alignment
+ * constraint.
+ *
+ * Returns allocated item or NULL on error (if kmalloc() failed).
+ */
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment);
+
+/**
+ * __cma_bf_hole_merge_maybe - tries to merge hole with neighbours.
+ * @item: hole to try and merge
+ *
+ * Which items are preserved is undefined so you may not rely on it.
+ */
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item);
+
+
+/************************* Device API *************************/
+
+int cma_bf_init(struct cma_region *reg)
+{
+	struct cma_bf_private *prv;
+	struct cma_bf_item *item;
+
+	prv = kzalloc(sizeof *prv, GFP_KERNEL);
+	if (unlikely(!prv))
+		return -ENOMEM;
+
+	item = kzalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item)) {
+		kfree(prv);
+		return -ENOMEM;
+	}
+
+	item->ch.start = reg->start;
+	item->ch.size  = reg->size;
+	item->ch.reg   = reg;
+
+	rb_root_init(&prv->by_start_root, &item->ch.by_start);
+	rb_root_init(&prv->by_size_root, &item->by_size);
+
+	reg->private_data = prv;
+	return 0;
+}
+
+void cma_bf_cleanup(struct cma_region *reg)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct cma_bf_item *item =
+		rb_entry(prv->by_size_root.rb_node,
+			 struct cma_bf_item, by_size);
+
+	/* We can assume there is only a single hole in the tree. */
+	WARN_ON(item->by_size.rb_left || item->by_size.rb_right ||
+		item->ch.by_start.rb_left || item->ch.by_start.rb_right);
+
+	kfree(item);
+	kfree(prv);
+}
+
+struct cma_chunk *cma_bf_alloc(struct cma_region *reg,
+			       size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct rb_node *node = prv->by_size_root.rb_node;
+	struct cma_bf_item *item = NULL;
+
+	/* First find hole that is large enough */
+	while (node) {
+		struct cma_bf_item *i =
+			rb_entry(node, struct cma_bf_item, by_size);
+
+		if (i->ch.size < size) {
+			node = node->rb_right;
+		} else if (i->ch.size >= size) {
+			node = node->rb_left;
+			item = i;
+		}
+	}
+	if (!item)
+		return NULL;
+
+	/* Now look for items which can satisfy alignment requirements */
+	for (;;) {
+		dma_addr_t start = ALIGN(item->ch.start, alignment);
+		dma_addr_t end   = item->ch.start + item->ch.size;
+		if (start < end && end - start >= size) {
+			item = __cma_bf_hole_take(item, size, alignment);
+			return likely(item) ? &item->ch : NULL;
+		}
+
+		node = rb_next(node);
+		if (!node)
+			return NULL;
+
+		item  = rb_entry(node, struct cma_bf_item, by_size);
+	}
+}
+
+void cma_bf_free(struct cma_chunk *chunk)
+{
+	struct cma_bf_item *item = container_of(chunk, struct cma_bf_item, ch);
+
+	/* Add new hole */
+	if (unlikely(__cma_bf_hole_insert_by_start(item))) {
+		/*
+		 * We're screwed...  Just free the item and forget
+		 * about it.  Things are broken beyond repair so no
+		 * sense in trying to recover.
+		 */
+		kfree(item);
+	} else {
+		__cma_bf_hole_insert_by_size(item);
+
+		/* Merge with prev and next sibling */
+		__cma_bf_hole_merge_maybe(item);
+	}
+}
+
+
+/************************* Basic Tree Manipulation *************************/
+
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_size_root.rb_node, *parent = NULL;
+	const typeof(item->ch.size) value = item->ch.size;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, by_size);
+		link = value <= i->ch.size
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->by_size, parent, link);
+	rb_insert_color(&item->by_size, &prv->by_size_root);
+}
+
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->by_size, &prv->by_size_root);
+}
+
+static int  __must_check
+__cma_bf_hole_insert_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_start_root.rb_node, *parent = NULL;
+	const typeof(item->ch.start) value = item->ch.start;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, ch.by_start);
+
+		if (WARN_ON(value == i->ch.start))
+			/*
+			 * This should *never* happen.  And I mean
+			 * *never*.  We could even BUG on it but
+			 * hopefully things are only a bit broken,
+			 * ie. system can still run.  We produce
+			 * a warning and return an error.
+			 */
+			return -EBUSY;
+
+		link = value <= i->ch.start
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->ch.by_start, parent, link);
+	rb_insert_color(&item->ch.by_start, &prv->by_start_root);
+	return 0;
+}
+
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->ch.by_start, &prv->by_start_root);
+}
+
+
+/************************* More Tree Manipulation *************************/
+
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_item *item;
+
+	/*
+	 * There are three cases:
+	 * 1. the chunk takes the whole hole,
+	 * 2. the chunk is at the beginning or at the end of the hole, or
+	 * 3. the chunk is in the middle of the hole.
+	 */
+
+
+	/* Case 1, the whole hole */
+	if (size == hole->ch.size) {
+		__cma_bf_hole_erase_by_size(hole);
+		__cma_bf_hole_erase_by_start(hole);
+		return hole;
+	}
+
+
+	/* Allocate */
+	item = kmalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item))
+		return NULL;
+
+	item->ch.start = ALIGN(hole->ch.start, alignment);
+	item->ch.size  = size;
+
+	/* Case 3, in the middle */
+	if (item->ch.start != hole->ch.start
+	 && item->ch.start + item->ch.size !=
+	    hole->ch.start + hole->ch.size) {
+		struct cma_bf_item *tail;
+
+		/*
+		 * Space between the end of the chunk and the end of
+		 * the region, ie. space left after the end of the
+	 * chunk.  If this is divisible by alignment we can
+		 * move the chunk to the end of the hole.
+		 */
+		size_t left =
+			hole->ch.start + hole->ch.size -
+			(item->ch.start + item->ch.size);
+		if (left % alignment == 0) {
+			item->ch.start += left;
+			goto case_2;
+		}
+
+		/*
+		 * We are going to add a hole at the end.  This way,
+		 * we will reduce the problem to case 2 -- the chunk
+		 * will be at the end of the hole.
+		 */
+		tail = kmalloc(sizeof *tail, GFP_KERNEL);
+		if (unlikely(!tail)) {
+			kfree(item);
+			return NULL;
+		}
+
+		tail->ch.start = item->ch.start + item->ch.size;
+		tail->ch.size  =
+			hole->ch.start + hole->ch.size - tail->ch.start;
+		tail->ch.reg   = hole->ch.reg;
+
+		if (unlikely(__cma_bf_hole_insert_by_start(tail))) {
+			/*
+			 * Things are broken beyond repair...  Abort
+			 * inserting the hole but still continue with
+			 * allocation (seems like the best we can do).
+			 */
+
+			hole->ch.size = tail->ch.start - hole->ch.start;
+			kfree(tail);
+		} else {
+			__cma_bf_hole_insert_by_size(tail);
+			/*
+			 * It's important that we first insert the new
+			 * hole in the tree sorted by size and later
+			 * reduce the size of the old hole.  We will
+			 * update the position of the old hole in the
+			 * rb tree in code that handles case 2.
+			 */
+			hole->ch.size = tail->ch.start - hole->ch.start;
+		}
+
+		/* Go to case 2 */
+	}
+
+
+	/* Case 2, at the beginning or at the end */
+case_2:
+	/* No need to update the tree; order preserved. */
+	if (item->ch.start == hole->ch.start)
+		hole->ch.start += item->ch.size;
+
+	/* Alter hole's size */
+	hole->ch.size -= size;
+	__cma_bf_hole_erase_by_size(hole);
+	__cma_bf_hole_insert_by_size(hole);
+
+	return item;
+}
+
+
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item)
+{
+	struct cma_bf_item *prev;
+	struct rb_node *node;
+	int twice = 2;
+
+	node = rb_prev(&item->ch.by_start);
+	if (unlikely(!node))
+		goto next;
+	prev = rb_entry(node, struct cma_bf_item, ch.by_start);
+
+	for (;;) {
+		if (prev->ch.start + prev->ch.size == item->ch.start) {
+			/* Remove previous hole from trees */
+			__cma_bf_hole_erase_by_size(prev);
+			__cma_bf_hole_erase_by_start(prev);
+
+			/* Alter this hole */
+			item->ch.size += prev->ch.size;
+			item->ch.start = prev->ch.start;
+			__cma_bf_hole_erase_by_size(item);
+			__cma_bf_hole_insert_by_size(item);
+			/*
+			 * No need to update by start trees as we do
+			 * not break sequence order
+			 */
+
+			/* Free prev hole */
+			kfree(prev);
+		}
+
+next:
+		if (!--twice)
+			break;
+
+		node = rb_next(&item->ch.by_start);
+		if (unlikely(!node))
+			break;
+		prev = item;
+		item = rb_entry(node, struct cma_bf_item, ch.by_start);
+	}
+}
+
+
+
+/************************* Register *************************/
+static int cma_bf_module_init(void)
+{
+	static struct cma_allocator alloc = {
+		.name    = "bf",
+		.init    = cma_bf_init,
+		.cleanup = cma_bf_cleanup,
+		.alloc   = cma_bf_alloc,
+		.free    = cma_bf_free,
+	};
+	return cma_allocator_register(&alloc);
+}
+module_init(cma_bf_module_init);
diff --git a/mm/cma.c b/mm/cma.c
new file mode 100644
index 0000000..06d0d5a
--- /dev/null
+++ b/mm/cma.c
@@ -0,0 +1,911 @@
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#ifndef CONFIG_NO_BOOTMEM
+#  include <linux/bootmem.h>   /* alloc_bootmem_pages_nopanic() */
+#endif
+#ifdef CONFIG_HAVE_MEMBLOCK
+#  include <linux/memblock.h>  /* memblock*() */
+#endif
+#include <linux/device.h>      /* struct device, dev_name() */
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR, PTR_ERR, etc. */
+#include <linux/mm.h>          /* PAGE_ALIGN() */
+#include <linux/module.h>      /* EXPORT_SYMBOL_GPL() */
+#include <linux/mutex.h>       /* mutex */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/string.h>      /* str*() */
+
+#include <linux/cma.h>
+
+
+/*
+ * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
+ * cma_chunks_by_start.
+ */
+static DEFINE_MUTEX(cma_mutex);
+
+
+
+/************************* Map attribute *************************/
+
+static const char *cma_map;
+static size_t cma_map_length;
+
+/*
+ * map-attr      ::= [ rules [ ';' ] ]
+ * rules         ::= rule [ ';' rules ]
+ * rule          ::= patterns '=' regions
+ * patterns      ::= pattern [ ',' patterns ]
+ * regions       ::= REG-NAME [ ',' regions ]
+ * pattern       ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ */
+static ssize_t cma_map_validate(const char *param)
+{
+	const char *ch = param;
+
+	if (*ch == '\0' || *ch == '\n')
+		return 0;
+
+	for (;;) {
+		const char *start = ch;
+
+		while (*ch && *ch != '\n' && *ch != ';' && *ch != '=')
+			++ch;
+
+		if (*ch != '=' || start == ch) {
+			pr_err("map: expecting \"<patterns>=<regions>\" near %s\n",
+			       start);
+			return -EINVAL;
+		}
+
+		while (*++ch != ';')
+			if (*ch == '\0' || *ch == '\n')
+				return ch - param;
+		if (ch[1] == '\0' || ch[1] == '\n')
+			return ch - param;
+		++ch;
+	}
+}
+
+static int __init cma_map_param(char *param)
+{
+	ssize_t len;
+
+	pr_debug("param: map: %s\n", param);
+
+	len = cma_map_validate(param);
+	if (len < 0)
+		return len;
+
+	cma_map = param;
+	cma_map_length = len;
+	return 0;
+}
+
+
+
+/************************* Early regions *************************/
+
+struct list_head cma_early_regions __initdata =
+	LIST_HEAD_INIT(cma_early_regions);
+
+
+int __init __must_check cma_early_region_register(struct cma_region *reg)
+{
+	dma_addr_t start, alignment;
+	size_t size;
+
+	if (reg->alignment & (reg->alignment - 1))
+		return -EINVAL;
+
+	alignment = max(reg->alignment, (dma_addr_t)PAGE_SIZE);
+	start     = ALIGN(reg->start, alignment);
+	size      = PAGE_ALIGN(reg->size);
+
+	if (start + size < start)
+		return -EINVAL;
+
+	reg->size      = size;
+	reg->start     = start;
+	reg->alignment = alignment;
+
+	list_add_tail(&reg->list, &cma_early_regions);
+
+	pr_debug("param: registering early region %s (%p@%p/%p)\n",
+		 reg->name, (void *)reg->size, (void *)reg->start,
+		 (void *)reg->alignment);
+
+	return 0;
+}
+
+
+
+/************************* Regions & Allocators *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg);
+
+/* List of all regions.  Named regions are kept before unnamed. */
+static LIST_HEAD(cma_regions);
+
+#define cma_foreach_region(reg) \
+	list_for_each_entry(reg, &cma_regions, list)
+
+int __must_check cma_region_register(struct cma_region *reg)
+{
+	const char *name, *alloc_name;
+	struct cma_region *r;
+	char *ch = NULL;
+	int ret = 0;
+
+	if (!reg->size || reg->start + reg->size < reg->start)
+		return -EINVAL;
+
+	reg->users = 0;
+	reg->used = 0;
+	reg->private_data = NULL;
+	reg->registered = 0;
+	reg->free_space = reg->size;
+
+	/* Copy name and alloc_name */
+	name = reg->name;
+	alloc_name = reg->alloc_name;
+	if (reg->copy_name && (reg->name || reg->alloc_name)) {
+		size_t name_size, alloc_size;
+
+		name_size  = reg->name       ? strlen(reg->name) + 1       : 0;
+		alloc_size = reg->alloc_name ? strlen(reg->alloc_name) + 1 : 0;
+
+		ch = kmalloc(name_size + alloc_size, GFP_KERNEL);
+		if (!ch) {
+			pr_err("%s: not enough memory to allocate name\n",
+			       reg->name ?: "(private)");
+			return -ENOMEM;
+		}
+
+		if (name_size) {
+			memcpy(ch, reg->name, name_size);
+			name = ch;
+			ch += name_size;
+		}
+
+		if (alloc_size) {
+			memcpy(ch, reg->alloc_name, alloc_size);
+			alloc_name = ch;
+		}
+	}
+
+	mutex_lock(&cma_mutex);
+
+	/* Don't let regions overlap */
+	cma_foreach_region(r)
+		if (r->start + r->size > reg->start &&
+		    r->start < reg->start + reg->size) {
+			ret = -EADDRINUSE;
+			goto done;
+		}
+
+	if (reg->alloc) {
+		ret = __cma_region_attach_alloc(reg);
+		if (unlikely(ret < 0))
+			goto done;
+	}
+
+	reg->name = name;
+	reg->alloc_name = alloc_name;
+	reg->registered = 1;
+	ch = NULL;
+
+	/*
+	 * Keep named at the beginning and unnamed (private) at the
+	 * end.  This helps in traversal when named region is looked
+	 * for.
+	 */
+	if (name)
+		list_add(&reg->list, &cma_regions);
+	else
+		list_add_tail(&reg->list, &cma_regions);
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: region %sregistered\n",
+		 reg->name ?: "(private)", ret ? "not " : "");
+	kfree(ch);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_region_register);
+
+static struct cma_region *__must_check
+__cma_region_find(const char **namep)
+{
+	struct cma_region *reg;
+	const char *ch, *name;
+	size_t n;
+
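+	/*
+	 * Extract the next region name from the comma-separated list
+	 * at *namep and advance *namep past it (and past the
+	 * following comma, if there is one).
+	 */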
+	ch = *namep;
+	while (*ch && *ch != ',' && *ch != ';')
+		++ch;
+	name = *namep;
+	*namep = *ch == ',' ? ch + 1 : ch;
+	n = ch - name;
+
+	/*
+	 * Named regions are kept in front of unnamed ones, so if we
+	 * encounter an unnamed region we can stop.
+	 */
+	cma_foreach_region(reg)
+		if (!reg->name)
+			break;
+		else if (!strncmp(name, reg->name, n) && !reg->name[n])
+			return reg;
+
+	return NULL;
+}
+
+
+/* List of all allocators. */
+static LIST_HEAD(cma_allocators);
+
+#define cma_foreach_allocator(alloc) \
+	list_for_each_entry(alloc, &cma_allocators, list)
+
+int cma_allocator_register(struct cma_allocator *alloc)
+{
+	struct cma_region *reg;
+	int first;
+
+	if (!alloc->alloc || !alloc->free)
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	first = list_empty(&cma_allocators);
+
+	list_add_tail(&alloc->list, &cma_allocators);
+
+	/*
+	 * Attach this allocator to all allocator-less regions that
+	 * request this particular allocator (reg->alloc_name equals
+	 * alloc->name) or that want the first available allocator,
+	 * provided we are the first one registered.
+	 */
+	cma_foreach_region(reg) {
+		if (reg->alloc)
+			continue;
+		if (reg->alloc_name
+		  ? !alloc->name || strcmp(alloc->name, reg->alloc_name)
+		  : (reg->used || !first))
+			continue;
+
+		reg->alloc = alloc;
+		__cma_region_attach_alloc(reg);
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: allocator registered\n", alloc->name ?: "(unnamed)");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cma_allocator_register);
+
+static struct cma_allocator *__must_check
+__cma_allocator_find(const char *name)
+{
+	struct cma_allocator *alloc;
+
+	if (!name)
+		return list_empty(&cma_allocators)
+			? NULL
+			: list_entry(cma_allocators.next,
+				     struct cma_allocator, list);
+
+	cma_foreach_allocator(alloc)
+		if (alloc->name && !strcmp(name, alloc->name))
+			return alloc;
+
+	return NULL;
+}
+
+
+
+/************************* Initialise CMA *************************/
+
+int __init cma_set_defaults(struct cma_region *regions, const char *map)
+{
+	if (map) {
+		int ret = cma_map_param((char *)map);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	if (!regions)
+		return 0;
+
+	for (; regions->size; ++regions) {
+		int ret = cma_early_region_register(regions);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	return 0;
+}
+
+
+int __init cma_early_region_reserve(struct cma_region *reg)
+{
+	int tried = 0;
+
+	if (!reg->size || (reg->alignment & (reg->alignment - 1)) ||
+	    reg->reserved)
+		return -EINVAL;
+
+#ifndef CONFIG_NO_BOOTMEM
+
+	tried = 1;
+
+	{
+		void *ptr = __alloc_bootmem_nopanic(reg->size, reg->alignment,
+						    reg->start);
+		if (ptr) {
+			reg->start = virt_to_phys(ptr);
+			reg->reserved = 1;
+			return 0;
+		}
+	}
+
+#endif
+
+#ifdef CONFIG_HAVE_MEMBLOCK
+
+	tried = 1;
+
+	if (reg->start) {
+		if (!memblock_is_region_reserved(reg->start, reg->size) &&
+		    memblock_reserve(reg->start, reg->size) >= 0) {
+			reg->reserved = 1;
+			return 0;
+		}
+	} else {
+		/*
+		 * Use __memblock_alloc_base() since
+		 * memblock_alloc_base() panic()s.
+		 */
+		u64 ret = __memblock_alloc_base(reg->size, reg->alignment, 0);
+		if (ret &&
+		    ret < ~(dma_addr_t)0 &&
+		    ret + reg->size < ~(dma_addr_t)0 &&
+		    ret + reg->size > ret) {
+			reg->start = ret;
+			reg->reserved = 1;
+			return 0;
+		}
+
+		if (ret)
+			memblock_free(ret, reg->size);
+	}
+
+#endif
+
+	return tried ? -ENOMEM : -EOPNOTSUPP;
+}
+
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg))
+{
+	struct cma_region *reg;
+
+	pr_debug("init: reserving early regions\n");
+
+	if (!reserve)
+		reserve = cma_early_region_reserve;
+
+	list_for_each_entry(reg, &cma_early_regions, list) {
+		if (reg->reserved) {
+			/* nothing */
+		} else if (reserve(reg) >= 0) {
+			pr_debug("init: %s: reserved %p@%p\n",
+				 reg->name ?: "(private)",
+				 (void *)reg->size, (void *)reg->start);
+			reg->reserved = 1;
+		} else {
+			pr_warn("init: %s: unable to reserve %p@%p/%p\n",
+				reg->name ?: "(private)",
+				(void *)reg->size, (void *)reg->start,
+				(void *)reg->alignment);
+		}
+	}
+}
+
+
+static int __init cma_init(void)
+{
+	struct cma_region *reg, *n;
+
+	pr_debug("init: initialising\n");
+
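+	/*
+	 * cma_map may still point into init-only data (the command
+	 * line), so duplicate it into regular kernel memory.  If the
+	 * duplication fails, clear the pointer instead of keeping
+	 * a reference that is about to become stale.
+	 */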
+	if (cma_map) {
+		char *val = kmemdup(cma_map, cma_map_length + 1, GFP_KERNEL);
+		cma_map = val;
+		if (!val)
+			return -ENOMEM;
+		val[cma_map_length] = '\0';
+	}
+
+	list_for_each_entry_safe(reg, n, &cma_early_regions, list) {
+		INIT_LIST_HEAD(&reg->list);
+		/*
+		 * We don't care if there was an error.  It's a pity
+		 * but there's not much we can do about it anyway.
+		 * If the error is on a region that was parsed from
+		 * the command line then it will stay and waste a bit
+		 * of space; if it was registered using
+		 * cma_early_region_register() it is the caller's
+		 * responsibility to do something about it.
+		 */
+		if (reg->reserved && cma_region_register(reg) < 0)
+			/* ignore error */;
+	}
+
+	INIT_LIST_HEAD(&cma_early_regions);
+
+	return 0;
+}
+/*
+ * We want to be initialised earlier than module_init/__initcall so
+ * that drivers that want to grab memory at boot time will find CMA
+ * ready.  subsys_initcall() seems early enough and not too early at
+ * the same time.
+ */
+subsys_initcall(cma_init);
+
+
+
+/************************* Chunks *************************/
+
+/* All chunks sorted by start address. */
+static struct rb_root cma_chunks_by_start;
+
+static struct cma_chunk *__must_check __cma_chunk_find(dma_addr_t addr)
+{
+	struct cma_chunk *chunk;
+	struct rb_node *n;
+
+	for (n = cma_chunks_by_start.rb_node; n; ) {
+		chunk = rb_entry(n, struct cma_chunk, by_start);
+		if (addr < chunk->start)
+			n = n->rb_left;
+		else if (addr > chunk->start)
+			n = n->rb_right;
+		else
+			return chunk;
+	}
+	WARN(1, "no chunk starting at %p\n", (void *)addr);
+	return NULL;
+}
+
+static int __must_check __cma_chunk_insert(struct cma_chunk *chunk)
+{
+	struct rb_node **new, *parent = NULL;
+	typeof(chunk->start) addr = chunk->start;
+
+	for (new = &cma_chunks_by_start.rb_node; *new; ) {
+		struct cma_chunk *c =
+			container_of(*new, struct cma_chunk, by_start);
+
+		parent = *new;
+		if (addr < c->start) {
+			new = &(*new)->rb_left;
+		} else if (addr > c->start) {
+			new = &(*new)->rb_right;
+		} else {
+			/*
+			 * We should never be here.  If we are, it
+			 * means the allocator gave us an invalid
+			 * chunk (one that has already been allocated)
+			 * so we refuse to accept it.  Our caller will
+			 * recover by freeing the chunk.
+			 */
+			WARN_ON(1);
+			return -EADDRINUSE;
+		}
+	}
+
+	rb_link_node(&chunk->by_start, parent, new);
+	rb_insert_color(&chunk->by_start, &cma_chunks_by_start);
+
+	return 0;
+}
+
+static void __cma_chunk_free(struct cma_chunk *chunk)
+{
+	rb_erase(&chunk->by_start, &cma_chunks_by_start);
+
+	chunk->reg->alloc->free(chunk);
+	--chunk->reg->users;
+	chunk->reg->free_space += chunk->size;
+}
+
+
+/************************* The Device API *************************/
+
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *type);
+
+
+/* Allocate. */
+
+static dma_addr_t __must_check
+__cma_alloc_from_region(struct cma_region *reg,
+			size_t size, dma_addr_t alignment)
+{
+	struct cma_chunk *chunk;
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!reg || reg->free_space < size)
+		return -ENOMEM;
+
+	if (!reg->alloc) {
+		if (!reg->used)
+			__cma_region_attach_alloc(reg);
+		if (!reg->alloc)
+			return -ENOMEM;
+	}
+
+	chunk = reg->alloc->alloc(reg, size, alignment);
+	if (!chunk)
+		return -ENOMEM;
+
+	if (unlikely(__cma_chunk_insert(chunk) < 0)) {
+		/* We should *never* be here; chunk->reg is not set yet. */
+		reg->alloc->free(chunk);
+		kfree(chunk);
+		return -EADDRINUSE;
+	}
+
+	chunk->reg = reg;
+	++reg->users;
+	reg->free_space -= chunk->size;
+	pr_debug("allocated at %p\n", (void *)chunk->start);
+	return chunk->start;
+}
+
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment)
+{
+	dma_addr_t addr;
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!size || alignment & (alignment - 1) || !reg)
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	addr = reg->registered ?
+		__cma_alloc_from_region(reg, PAGE_ALIGN(size),
+					max(alignment, (dma_addr_t)PAGE_SIZE)) :
+		-EINVAL;
+
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(cma_alloc_from_region);
+
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *type,
+	    dma_addr_t size, dma_addr_t alignment)
+{
+	struct cma_region *reg;
+	const char *from;
+	dma_addr_t addr;
+
+	if (dev)
+		pr_debug("allocate %p/%p for %s/%s\n",
+			 (void *)size, (void *)alignment,
+			 dev_name(dev), type ?: "");
+
+	if (!size || alignment & (alignment - 1))
+		return -EINVAL;
+
+	size = PAGE_ALIGN(size);
+	if (alignment < PAGE_SIZE)
+		alignment = PAGE_SIZE;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, type);
+	if (unlikely(IS_ERR(from))) {
+		addr = PTR_ERR(from);
+		goto done;
+	}
+
+	pr_debug("allocate %p/%p from one of %s\n",
+		 (void *)size, (void *)alignment, from);
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		addr = __cma_alloc_from_region(reg, size, alignment);
+		if (!IS_ERR_VALUE(addr))
+			goto done;
+	}
+
+	pr_debug("not enough memory\n");
+	addr = -ENOMEM;
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(__cma_alloc);
+
+
+/* Query information about regions. */
+static void __cma_info_add(struct cma_info *infop, struct cma_region *reg)
+{
+	infop->total_size += reg->size;
+	infop->free_size += reg->free_space;
+	if (infop->lower_bound > reg->start)
+		infop->lower_bound = reg->start;
+	if (infop->upper_bound < reg->start + reg->size)
+		infop->upper_bound = reg->start + reg->size;
+	++infop->count;
+}
+
+int
+__cma_info(struct cma_info *infop, const struct device *dev, const char *type)
+{
+	struct cma_info info = { ~(dma_addr_t)0, 0, 0, 0, 0 };
+	struct cma_region *reg;
+	const char *from;
+	int ret;
+
+	if (unlikely(!infop))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, type);
+	if (IS_ERR(from)) {
+		ret = PTR_ERR(from);
+		info.lower_bound = 0;
+		goto done;
+	}
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		if (reg)
+			__cma_info_add(&info, reg);
+	}
+
+	ret = 0;
+done:
+	mutex_unlock(&cma_mutex);
+
+	memcpy(infop, &info, sizeof info);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__cma_info);
+
+
+/* Freeing. */
+int cma_free(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+	int ret;
+
+	mutex_lock(&cma_mutex);
+
+	c = __cma_chunk_find(addr);
+
+	if (c) {
+		__cma_chunk_free(c);
+		ret = 0;
+	} else {
+		ret = -ENOENT;
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	if (c)
+		pr_debug("free(%p): freed\n", (void *)addr);
+	else
+		pr_err("free(%p): not found\n", (void *)addr);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_free);
+
+
+/************************* Miscellaneous *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg)
+{
+	struct cma_allocator *alloc;
+	int ret;
+
+	/*
+	 * If reg->alloc is set then the caller wants us to use this
+	 * allocator.  Otherwise we need to find one by name.
+	 */
+	if (reg->alloc) {
+		alloc = reg->alloc;
+	} else {
+		alloc = __cma_allocator_find(reg->alloc_name);
+		if (!alloc) {
+			pr_warn("init: %s: %s: no such allocator\n",
+				reg->name ?: "(private)",
+				reg->alloc_name ?: "(default)");
+			reg->used = 1;
+			return -ENOENT;
+		}
+	}
+
+	/* Try to initialise the allocator. */
+	reg->private_data = NULL;
+	ret = alloc->init ? alloc->init(reg) : 0;
+	if (unlikely(ret < 0)) {
+		pr_err("init: %s: %s: unable to initialise allocator\n",
+		       reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+		reg->alloc = NULL;
+		reg->used = 1;
+	} else {
+		reg->alloc = alloc;
+		pr_debug("init: %s: %s: initialised allocator\n",
+			 reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+	}
+	return ret;
+}
+
+
+/*
+ * s            ::= rules
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' regions
+ * patterns     ::= pattern [ ',' patterns ]
+ * regions      ::= REG-NAME [ ',' regions ]
+ * pattern      ::= dev-pattern [ '/' TYPE-NAME ] | '/' TYPE-NAME
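+ *
+ * An example (the device, type and region names below are made up
+ * purely for illustration):
+ *
+ *     camera=cam-region;video*,tv/fimd=fb-region
+ *
+ * maps allocations made by a device named "camera" to region
+ * "cam-region", and allocations made by any device whose name
+ * starts with "video", or of type "fimd" requested by device "tv",
+ * to region "fb-region".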
+ */
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *type)
+{
+	/*
+	 * This function matches the patterns from the map attribute
+	 * against the given device name and type.  The type may of
+	 * course be NULL or an empty string.
+	 */
+
+	const char *s, *name;
+	int name_matched = 0;
+
+	/*
+	 * If dev is NULL we were called in the alternative form where
+	 * type is the "from" string itself.  We just return it.
+	 */
+	if (!dev)
+		return type ?: ERR_PTR(-EINVAL);
+
+	if (!cma_map)
+		return ERR_PTR(-ENOENT);
+
+	name = dev_name(dev);
+	if (WARN_ON(!name || !*name))
+		return ERR_PTR(-EINVAL);
+
+	if (!type)
+		type = "common";
+
+	/*
+	 * Now we go through the cma_map attribute.
+	 */
+	for (s = cma_map; *s; ++s) {
+		const char *c;
+
+		/*
+		 * If the pattern starts with a slash, the device part of the
+		 * pattern matches if it matched previously.
+		 */
+		if (*s == '/') {
+			if (!name_matched)
+				goto look_for_next;
+			goto match_type;
+		}
+
+		/*
+		 * We are now trying to match the device name.  This also
+		 * updates the name_matched variable.  If, while reading the
+		 * spec, we encounter a comma it means that the pattern does
+		 * not match and we need to start over with another pattern
+		 * (the one after the comma).  If we encounter an equals sign
+		 * we need to start over with another rule.  If there is
+		 * a character that does not match, we need to look for
+		 * a comma (to get another pattern) or a semicolon (to get
+		 * another rule) and try again if there is one somewhere.
+		 */
+
+		name_matched = 0;
+
+		for (c = name; *s != '*' && *c; ++c, ++s)
+			if (*s == '=')
+				goto next_rule;
+			else if (*s == ',')
+				goto next_pattern;
+			else if (*s != '?' && *c != *s)
+				goto look_for_next;
+		if (*s == '*')
+			++s;
+
+		name_matched = 1;
+
+		/*
+		 * Now we need to match the type part of the pattern.  If the
+		 * pattern does not include one, it matches any type.
+		 * Otherwise we try to match it just like the name.
+		 */
+		if (*s == '/') {
+match_type:		/* s points to '/' */
+			++s;
+
+			for (c = type; *s && *c; ++c, ++s)
+				if (*s == '=')
+					goto next_rule;
+				else if (*s == ',')
+					goto next_pattern;
+				else if (*c != *s)
+					goto look_for_next;
+		}
+
+		/* Return the string behind the '=' sign of the rule. */
+		if (*s == '=')
+			return s + 1;
+		else if (*s == ',')
+			return strchr(s, '=') + 1;
+
+		/* Pattern did not match */
+
+look_for_next:
+		do {
+			++s;
+		} while (*s != ',' && *s != '=');
+		if (*s == ',')
+			continue;
+
+next_rule:	/* s points to '=' */
+		s = strchr(s, ';');
+		if (!s)
+			break;
+
+next_pattern:
+		continue;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 3/9] mm: cma: Added SysFS support
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

The SysFS development interface lets one change the map attribute
at run time as well as observe what regions have been reserved.
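
For illustration, a possible session on a system where the platform
code has registered a region named "r1" (the region and device names
below are made up):

  # cat /sys/kernel/mm/contiguous/map
  video*,camera=r1
  # cat /sys/kernel/mm/contiguous/regions/r1/free
  33554432
  # echo 'video*=r1;camera=r1' > /sys/kernel/mm/contiguous/map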

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 .../ABI/testing/sysfs-kernel-mm-contiguous         |   53 +++
 Documentation/contiguous-memory.txt                |    4 +
 include/linux/cma.h                                |    7 +
 mm/Kconfig                                         |   26 ++-
 mm/cma.c                                           |  345 +++++++++++++++++++-
 5 files changed, 430 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
new file mode 100644
index 0000000..8df15bc
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
@@ -0,0 +1,53 @@
+What:		/sys/kernel/mm/contiguous/
+Date:		August 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		If CMA has been built with SysFS support,
+		/sys/kernel/mm/contiguous/ contains a file called
+		"map", a file called "allocators" and a directory
+		called "regions".
+
+		The "map" file lets one change the CMA's map attribute
+		at run-time.
+
+		The "allocators" file lists all registered allocators.
+		Allocators with no name are listed as a single minus
+		sign.
+
+		The "regions" directory lists all reserved regions.
+
+		For more details see
+		Documentation/contiguous-memory.txt.
+
+What:		/sys/kernel/mm/contiguous/regions/
+Date:		August 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		The /sys/kernel/mm/contiguous/regions/ directory
+		contains a directory for each registered CMA region.
+		The name of the directory is the same as the start
+		address of the region.
+
+		If the region is named, there is also a symbolic link,
+		named after the region, pointing to the region's
+		directory.
+
+		Each such directory contains the following files:
+
+		* "name"  -- the name of the region or an empty file
+		* "start" -- starting address of the region (formatted
+		            with %p, i.e. hex).
+		* "size"  -- size of the region (in bytes).
+		* "free"  -- free space in the region (in bytes).
+		* "users" -- number of chunks allocated in the region.
+		* "alloc" -- name of the allocator.
+
+		If no allocator is attached to the region, "alloc" is
+		either the name of the desired allocator in square
+		brackets (i.e. "[foo]") or an empty file if the region
+		is to be attached to the default allocator.  If an
+		allocator is attached to the region, "alloc" is either
+		its name or "-" if the attached allocator has no name.
+
+		If there are no chunks allocated in the given region
+		("users" is "0"), the name of the desired allocator can
+		be written to "alloc".
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
index e470c6f..15aff7a 100644
--- a/Documentation/contiguous-memory.txt
+++ b/Documentation/contiguous-memory.txt
@@ -256,6 +256,10 @@
      iff it matched in previous pattern.  If the second part is
      omitted it will mach any type of memory requested by device.
 
+     If SysFS support is enabled, this attribute is accessible via
+     SysFS and can be changed at run-time by writing to
+     /sys/kernel/mm/contiguous/map.
+
      Some examples (whitespace added for better readability):
 
          cma_map = foo/quaz = r1;
diff --git a/include/linux/cma.h b/include/linux/cma.h
index f6f9cb5..d0f41f4 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -18,6 +18,9 @@
 #include <linux/rbtree.h>
 #include <linux/list.h>
 #include <linux/init.h>
+#if defined CONFIG_CMA_SYSFS
+#  include <linux/kobject.h>
+#endif
 
 
 struct device;
@@ -204,6 +207,10 @@ struct cma_region {
 	unsigned users;
 	struct list_head list;
 
+#if defined CONFIG_CMA_SYSFS
+	struct kobject kobj;
+#endif
+
 	unsigned used:1;
 	unsigned registered:1;
 	unsigned reserved:1;
diff --git a/mm/Kconfig b/mm/Kconfig
index 86043a3..8bed799 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -322,16 +322,36 @@ config CMA
 	  For more information see <Documentation/contiguous-memory.txt>.
 	  If unsure, say "n".
 
-config CMA_DEBUG
-	bool "CMA debug messages (DEVELOPEMENT)"
+config CMA_DEVELOPEMENT
+	bool "Include CMA development features"
+	depends on CMA
+	help
+	  This lets you enable some development features of the CMA
+	  framework.  It does not add any code to the kernel.
+
+	  Those options are mostly usable during development and testing.
+	  If unsure, say "n".
+
+config CMA_DEBUG
+	bool "CMA debug messages"
+	depends on CMA_DEVELOPEMENT
+	help
 	  Turns on debug messages in CMA.  This produces KERN_DEBUG
 	  messages for every CMA call as well as various messages while
 	  processing calls such as cma_alloc().  This option does not
 	  affect warning and error messages.
 
-	  This is mostly used during development.  If unsure, say "n".
+config CMA_SYSFS
+	bool "CMA SysFS interface support"
+	depends on CMA_DEVELOPEMENT
+	help
+	  Enable support for the SysFS interface.  It is available
+	  under /sys/kernel/mm/contiguous.  Each region and allocator is
+	  represented there.
+
+	  For more information consult
+	  <Documentation/contiguous-memory.txt> and
+	  <Documentation/ABI/testing/sysfs-kernel-mm-contiguous> files.
 
 config CMA_BEST_FIT
 	bool "CMA best-fit allocator"
diff --git a/mm/cma.c b/mm/cma.c
index 06d0d5a..955f08c 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -38,8 +38,8 @@
 
 
 /*
- * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
- * cma_chunks_by_start.
+ * Protects cma_regions, cma_allocators, cma_map, cma_map_length,
+ * cma_kobj, cma_sysfs_regions and cma_chunks_by_start.
  */
 static DEFINE_MUTEX(cma_mutex);
 
@@ -143,7 +143,11 @@ int __init __must_check cma_early_region_register(struct cma_region *reg)
 
 /************************* Regions & Allocators *************************/
 
+static void __cma_sysfs_region_add(struct cma_region *reg);
+
 static int __cma_region_attach_alloc(struct cma_region *reg);
+static void __maybe_unused __cma_region_detach_alloc(struct cma_region *reg);
+
 
 /* List of all regions.  Named regions are kept before unnamed. */
 static LIST_HEAD(cma_regions);
@@ -226,6 +230,8 @@ int __must_check cma_region_register(struct cma_region *reg)
 	else
 		list_add_tail(&reg->list, &cma_regions);
 
+	__cma_sysfs_region_add(reg);
+
 done:
 	mutex_unlock(&cma_mutex);
 
@@ -482,6 +488,329 @@ subsys_initcall(cma_init);
 
 
 
+/************************* SysFS *************************/
+
+#if defined CONFIG_CMA_SYSFS
+
+static struct kobject cma_sysfs_regions;
+static int cma_sysfs_regions_ready;
+
+
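+/*
+ * Helpers that build an attribute object as an anonymous compound
+ * literal and return a pointer to its embedded struct attribute.
+ * The show/store callbacks are derived from the attribute's kind
+ * ("root" or "region") and its name.
+ */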
+#define CMA_ATTR_INLINE(_type, _name)					\
+	(&((struct cma_ ## _type ## _attribute){			\
+		.attr	= {						\
+			.name	= __stringify(_name),			\
+			.mode	= 0644,					\
+		},							\
+		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
+		.store	= cma_sysfs_ ## _type ## _ ## _name ## _store,	\
+	}).attr)
+
+#define CMA_ATTR_RO_INLINE(_type, _name)				\
+	(&((struct cma_ ## _type ## _attribute){			\
+		.attr	= {						\
+			.name	= __stringify(_name),			\
+			.mode	= 0444,					\
+		},							\
+		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
+	}).attr)
+
+
+struct cma_root_attribute {
+	struct attribute attr;
+	ssize_t (*show)(char *buf);
+	int (*store)(const char *buf);
+};
+
+static ssize_t cma_sysfs_root_map_show(char *page)
+{
+	ssize_t len;
+
+	len = cma_map_length;
+	if (!len) {
+		*page = 0;
+		len = 0;
+	} else {
+		if (len > (size_t)PAGE_SIZE - 1)
+			len = (size_t)PAGE_SIZE - 1;
+		memcpy(page, cma_map, len);
+		page[len++] = '\n';
+	}
+
+	return len;
+}
+
+static int cma_sysfs_root_map_store(const char *page)
+{
+	ssize_t len = cma_map_validate(page);
+	char *val = NULL;
+
+	if (len < 0)
+		return len;
+
+	if (len) {
+		val = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!val)
+			return -ENOMEM;
+		val[len] = '\0';
+	}
+
+	kfree(cma_map);
+	cma_map = val;
+	cma_map_length = len;
+
+	return 0;
+}
+
+static ssize_t cma_sysfs_root_allocators_show(char *page)
+{
+	struct cma_allocator *alloc;
+	size_t left = PAGE_SIZE;
+	char *ch = page;
+
+	cma_foreach_allocator(alloc) {
+		ssize_t l = snprintf(ch, left, "%s ", alloc->name ?: "-");
+		ch   += l;
+		left -= l;
+	}
+
+	if (ch != page)
+		ch[-1] = '\n';
+	return ch - page;
+}
+
+static ssize_t
+cma_sysfs_root_show(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct cma_root_attribute *rattr =
+		container_of(attr, struct cma_root_attribute, attr);
+	ssize_t ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->show(buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret;
+}
+
+static ssize_t
+cma_sysfs_root_store(struct kobject *kobj, struct attribute *attr,
+		       const char *buf, size_t count)
+{
+	struct cma_root_attribute *rattr =
+		container_of(attr, struct cma_root_attribute, attr);
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->store(buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret < 0 ? ret : count;
+}
+
+static struct kobj_type cma_sysfs_root_type = {
+	.sysfs_ops	= &(const struct sysfs_ops){
+		.show	= cma_sysfs_root_show,
+		.store	= cma_sysfs_root_store,
+	},
+	.default_attrs	= (struct attribute * []) {
+		CMA_ATTR_INLINE(root, map),
+		CMA_ATTR_RO_INLINE(root, allocators),
+		NULL
+	},
+};
+
+static int __init cma_sysfs_init(void)
+{
+	static struct kobject root;
+	static struct kobj_type fake_type;
+
+	struct cma_region *reg;
+	int ret;
+
+	/* Root */
+	ret = kobject_init_and_add(&root, &cma_sysfs_root_type,
+				   mm_kobj, "contiguous");
+	if (unlikely(ret < 0)) {
+		pr_err("init: unable to add root kobject: %d\n", ret);
+		return ret;
+	}
+
+	/* Regions */
+	ret = kobject_init_and_add(&cma_sysfs_regions, &fake_type,
+				   &root, "regions");
+	if (unlikely(ret < 0)) {
+		pr_err("init: unable to add regions kobject: %d\n", ret);
+		return ret;
+	}
+
+	mutex_lock(&cma_mutex);
+	cma_sysfs_regions_ready = 1;
+	cma_foreach_region(reg)
+		__cma_sysfs_region_add(reg);
+	mutex_unlock(&cma_mutex);
+
+	return 0;
+}
+device_initcall(cma_sysfs_init);
+
+
+
+struct cma_region_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct cma_region *reg, char *buf);
+	int (*store)(struct cma_region *reg, const char *buf);
+};
+
+
+static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
+{
+	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
+}
+
+static ssize_t cma_sysfs_region_start_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%p\n", (void *)reg->start);
+}
+
+static ssize_t cma_sysfs_region_size_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%zu\n", reg->size);
+}
+
+static ssize_t cma_sysfs_region_free_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%zu\n", reg->free_space);
+}
+
+static ssize_t cma_sysfs_region_users_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%u\n", reg->users);
+}
+
+static ssize_t cma_sysfs_region_alloc_show(struct cma_region *reg, char *page)
+{
+	if (reg->alloc)
+		return snprintf(page, PAGE_SIZE, "%s\n",
+				reg->alloc->name ?: "-");
+	else if (reg->alloc_name)
+		return snprintf(page, PAGE_SIZE, "[%s]\n", reg->alloc_name);
+	else
+		return 0;
+}
+
+static int
+cma_sysfs_region_alloc_store(struct cma_region *reg, const char *page)
+{
+	char *s;
+
+	if (reg->alloc && reg->users)
+		return -EBUSY;
+
+	if (!*page || *page == '\n') {
+		s = NULL;
+	} else {
+		size_t len;
+
+		for (s = (char *)page; *++s && *s != '\n'; )
+			/* nop */;
+
+		len = s - page;
+		s = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!s)
+			return -ENOMEM;
+		s[len] = '\0';
+	}
+
+	if (reg->alloc)
+		__cma_region_detach_alloc(reg);
+
+	if (reg->free_alloc_name)
+		kfree(reg->alloc_name);
+
+	reg->alloc_name = s;
+	reg->free_alloc_name = !!s;
+
+	return 0;
+}
+
+
+static ssize_t
+cma_sysfs_region_show(struct kobject *kobj, struct attribute *attr,
+		      char *buf)
+{
+	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
+	struct cma_region_attribute *rattr =
+		container_of(attr, struct cma_region_attribute, attr);
+	ssize_t ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->show(reg, buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret;
+}
+
+static ssize_t
+cma_sysfs_region_store(struct kobject *kobj, struct attribute *attr,
+		       const char *buf, size_t count)
+{
+	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
+	struct cma_region_attribute *rattr =
+		container_of(attr, struct cma_region_attribute, attr);
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->store(reg, buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret < 0 ? ret : count;
+}
+
+static struct kobj_type cma_sysfs_region_type = {
+	.sysfs_ops	= &(const struct sysfs_ops){
+		.show	= cma_sysfs_region_show,
+		.store	= cma_sysfs_region_store,
+	},
+	.default_attrs	= (struct attribute * []) {
+		CMA_ATTR_RO_INLINE(region, name),
+		CMA_ATTR_RO_INLINE(region, start),
+		CMA_ATTR_RO_INLINE(region, size),
+		CMA_ATTR_RO_INLINE(region, free),
+		CMA_ATTR_RO_INLINE(region, users),
+		CMA_ATTR_INLINE(region, alloc),
+		NULL
+	},
+};
+
+static void __cma_sysfs_region_add(struct cma_region *reg)
+{
+	int ret;
+
+	if (!cma_sysfs_regions_ready)
+		return;
+
+	memset(&reg->kobj, 0, sizeof reg->kobj);
+
+	ret = kobject_init_and_add(&reg->kobj, &cma_sysfs_region_type,
+				   &cma_sysfs_regions,
+				   "%p", (void *)reg->start);
+
+	if (reg->name &&
+	    sysfs_create_link(&cma_sysfs_regions, &reg->kobj, reg->name) < 0)
+		/* Ignore any errors. */;
+}
+
+#else
+
+static void __cma_sysfs_region_add(struct cma_region *reg)
+{
+	/* nop */
+}
+
+#endif
+
+
 /************************* Chunks *************************/
 
 /* All chunks sorted by start address. */
@@ -785,6 +1114,18 @@ static int __cma_region_attach_alloc(struct cma_region *reg)
 	return ret;
 }
 
+static void __cma_region_detach_alloc(struct cma_region *reg)
+{
+	if (!reg->alloc)
+		return;
+
+	if (reg->alloc->cleanup)
+		reg->alloc->cleanup(reg);
+
+	reg->alloc = NULL;
+	reg->used = 1;
+}
+
 
 /*
  * s            ::= rules
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 3/9] mm: cma: Added SysFS support
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

The SysFS development interface lets one change the map attribute
at run time as well as observe what regions have been reserved.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 .../ABI/testing/sysfs-kernel-mm-contiguous         |   53 +++
 Documentation/contiguous-memory.txt                |    4 +
 include/linux/cma.h                                |    7 +
 mm/Kconfig                                         |   26 ++-
 mm/cma.c                                           |  345 +++++++++++++++++++-
 5 files changed, 430 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
new file mode 100644
index 0000000..8df15bc
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
@@ -0,0 +1,53 @@
+What:		/sys/kernel/mm/contiguous/
+Date:		August 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		If CMA has been built with SysFS support,
+		/sys/kernel/mm/contiguous/ contains a file called
+		"map", a file called "allocators" and a directory
+		called "regions".
+
+		The "map" file lets one change the CMA's map attribute
+		at run-time.
+
+		The "allocators" file list all registered allocators.
+		Allocators with no name are listed as a single minus
+		sign.
+
+		The "regions" directory list all reserved regions.
+
+		For more details see
+		Documentation/contiguous-memory.txt.
+
+What:		/sys/kernel/mm/contiguous/regions/
+Date:		August 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		The /sys/kernel/mm/contiguous/regions/ directory
+		contain directories for each registered CMA region.
+		The name of the directory is the same as the start
+		address of the region.
+
+		If region is named there is also a symbolic link named
+		like the region pointing to the region's directory.
+
+		Such directory contains the following files:
+
+		* "name"  -- the name of the region or an empty file
+		* "start" -- starting address of the region (formatted
+		            with %p, ie. hex).
+		* "size"  -- size of the region (in bytes).
+		* "free"  -- free space in the region (in bytes).
+		* "users" -- number of chunks allocated in the region.
+		* "alloc" -- name of the allocator.
+
+		If allocator is not attached to the region, "alloc" is
+		either the name of desired allocator in square
+		brackets (ie. "[foo]") or an empty file if region is
+		to be attached to default allocator.  If an allocator
+		is attached to the region. "alloc" is either its name
+		or "-" if attached allocator has no name.
+
+		If there are no chunks allocated in given region
+		("users" is "0") then a name of desired allocator can
+		be written to "alloc".
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
index e470c6f..15aff7a 100644
--- a/Documentation/contiguous-memory.txt
+++ b/Documentation/contiguous-memory.txt
@@ -256,6 +256,10 @@
      iff it matched in previous pattern.  If the second part is
      omitted it will mach any type of memory requested by device.
 
+     If SysFS support is enabled, this attribute is accessible via
+     SysFS and can be changed at run-time by writing to
+     /sys/kernel/mm/contiguous/map.
+
      Some examples (whitespace added for better readability):
 
          cma_map = foo/quaz = r1;
diff --git a/include/linux/cma.h b/include/linux/cma.h
index f6f9cb5..d0f41f4 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -18,6 +18,9 @@
 #include <linux/rbtree.h>
 #include <linux/list.h>
 #include <linux/init.h>
+#if defined CONFIG_CMA_SYSFS
+#  include <linux/kobject.h>
+#endif
 
 
 struct device;
@@ -204,6 +207,10 @@ struct cma_region {
 	unsigned users;
 	struct list_head list;
 
+#if defined CONFIG_CMA_SYSFS
+	struct kobject kobj;
+#endif
+
 	unsigned used:1;
 	unsigned registered:1;
 	unsigned reserved:1;
diff --git a/mm/Kconfig b/mm/Kconfig
index 86043a3..8bed799 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -322,16 +322,36 @@ config CMA
 	  For more information see <Documentation/contiguous-memory.txt>.
 	  If unsure, say "n".
 
-config CMA_DEBUG
-	bool "CMA debug messages (DEVELOPEMENT)"
+config CMA_DEVELOPEMENT
+	bool "Include CMA developement features"
 	depends on CMA
 	help
+	  This lets you enable some developement features of the CMA
+	  framework.  It does not add any code to the kernel.
+
+	  Those options are mostly usable during development and testing.
+	  If unsure, say "n".
+
+config CMA_DEBUG
+	bool "CMA debug messages"
+	depends on CMA_DEVELOPEMENT
+	help
 	  Turns on debug messages in CMA.  This produces KERN_DEBUG
 	  messages for every CMA call as well as various messages while
 	  processing calls such as cma_alloc().  This option does not
 	  affect warning and error messages.
 
-	  This is mostly used during development.  If unsure, say "n".
+config CMA_SYSFS
+	bool "CMA SysFS interface support"
+	depends on CMA_DEVELOPEMENT
+	help
+	  Enable support for SysFS interface.  The interface is available
+	  under /sys/kernel/mm/contiguous.  Each region and allocator is
+	  represented there.
+
+	  For more information consult
+	  <Documentation/contiguous-memory.txt> and
+	  <Documentation/ABI/testing/sysfs-kernel-mm-contiguous> files.
 
 config CMA_BEST_FIT
 	bool "CMA best-fit allocator"
diff --git a/mm/cma.c b/mm/cma.c
index 06d0d5a..955f08c 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -38,8 +38,8 @@
 
 
 /*
- * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
- * cma_chunks_by_start.
+ * Protects cma_regions, cma_allocators, cma_map, cma_map_length,
+ * cma_kobj, cma_sysfs_regions and cma_chunks_by_start.
  */
 static DEFINE_MUTEX(cma_mutex);
 
@@ -143,7 +143,11 @@ int __init __must_check cma_early_region_register(struct cma_region *reg)
 
 /************************* Regions & Allocators *************************/
 
+static void __cma_sysfs_region_add(struct cma_region *reg);
+
 static int __cma_region_attach_alloc(struct cma_region *reg);
+static void __maybe_unused __cma_region_detach_alloc(struct cma_region *reg);
+
 
 /* List of all regions.  Named regions are kept before unnamed. */
 static LIST_HEAD(cma_regions);
@@ -226,6 +230,8 @@ int __must_check cma_region_register(struct cma_region *reg)
 	else
 		list_add_tail(&reg->list, &cma_regions);
 
+	__cma_sysfs_region_add(reg);
+
 done:
 	mutex_unlock(&cma_mutex);
 
@@ -482,6 +488,329 @@ subsys_initcall(cma_init);
 
 
 
+/************************* SysFS *************************/
+
+#if defined CONFIG_CMA_SYSFS
+
+static struct kobject cma_sysfs_regions;
+static int cma_sysfs_regions_ready;
+
+
+#define CMA_ATTR_INLINE(_type, _name)					\
+	(&((struct cma_ ## _type ## _attribute){			\
+		.attr	= {						\
+			.name	= __stringify(_name),			\
+			.mode	= 0644,					\
+		},							\
+		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
+		.store	= cma_sysfs_ ## _type ## _ ## _name ## _store,	\
+	}).attr)
+
+#define CMA_ATTR_RO_INLINE(_type, _name)				\
+	(&((struct cma_ ## _type ## _attribute){			\
+		.attr	= {						\
+			.name	= __stringify(_name),			\
+			.mode	= 0444,					\
+		},							\
+		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
+	}).attr)
+
+
+struct cma_root_attribute {
+	struct attribute attr;
+	ssize_t (*show)(char *buf);
+	int (*store)(const char *buf);
+};
+
+static ssize_t cma_sysfs_root_map_show(char *page)
+{
+	ssize_t len;
+
+	len = cma_map_length;
+	if (!len) {
+		*page = 0;
+		len = 0;
+	} else {
+		if (len > (size_t)PAGE_SIZE - 1)
+			len = (size_t)PAGE_SIZE - 1;
+		memcpy(page, cma_map, len);
+		page[len++] = '\n';
+	}
+
+	return len;
+}
+
+static int cma_sysfs_root_map_store(const char *page)
+{
+	ssize_t len = cma_map_validate(page);
+	char *val = NULL;
+
+	if (len < 0)
+		return len;
+
+	if (len) {
+		val = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!val)
+			return -ENOMEM;
+		val[len] = '\0';
+	}
+
+	kfree(cma_map);
+	cma_map = val;
+	cma_map_length = len;
+
+	return 0;
+}
+
+static ssize_t cma_sysfs_root_allocators_show(char *page)
+{
+	struct cma_allocator *alloc;
+	size_t left = PAGE_SIZE;
+	char *ch = page;
+
+	cma_foreach_allocator(alloc) {
+		ssize_t l = snprintf(ch, left, "%s ", alloc->name ?: "-");
+		ch   += l;
+		left -= l;
+	}
+
+	if (ch != page)
+		ch[-1] = '\n';
+	return ch - page;
+}
+
+static ssize_t
+cma_sysfs_root_show(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct cma_root_attribute *rattr =
+		container_of(attr, struct cma_root_attribute, attr);
+	ssize_t ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->show(buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret;
+}
+
+static ssize_t
+cma_sysfs_root_store(struct kobject *kobj, struct attribute *attr,
+		       const char *buf, size_t count)
+{
+	struct cma_root_attribute *rattr =
+		container_of(attr, struct cma_root_attribute, attr);
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->store(buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret < 0 ? ret : count;
+}
+
+static struct kobj_type cma_sysfs_root_type = {
+	.sysfs_ops	= &(const struct sysfs_ops){
+		.show	= cma_sysfs_root_show,
+		.store	= cma_sysfs_root_store,
+	},
+	.default_attrs	= (struct attribute * []) {
+		CMA_ATTR_INLINE(root, map),
+		CMA_ATTR_RO_INLINE(root, allocators),
+		NULL
+	},
+};
+
+static int __init cma_sysfs_init(void)
+{
+	static struct kobject root;
+	static struct kobj_type fake_type;
+
+	struct cma_region *reg;
+	int ret;
+
+	/* Root */
+	ret = kobject_init_and_add(&root, &cma_sysfs_root_type,
+				   mm_kobj, "contiguous");
+	if (unlikely(ret < 0)) {
+		pr_err("init: unable to add root kobject: %d\n", ret);
+		return ret;
+	}
+
+	/* Regions */
+	ret = kobject_init_and_add(&cma_sysfs_regions, &fake_type,
+				   &root, "regions");
+	if (unlikely(ret < 0)) {
+		pr_err("init: unable to add regions kobject: %d\n", ret);
+		return ret;
+	}
+
+	mutex_lock(&cma_mutex);
+	cma_sysfs_regions_ready = 1;
+	cma_foreach_region(reg)
+		__cma_sysfs_region_add(reg);
+	mutex_unlock(&cma_mutex);
+
+	return 0;
+}
+device_initcall(cma_sysfs_init);
+
+
+
+struct cma_region_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct cma_region *reg, char *buf);
+	int (*store)(struct cma_region *reg, const char *buf);
+};
+
+
+static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
+{
+	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
+}
+
+static ssize_t cma_sysfs_region_start_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%p\n", (void *)reg->start);
+}
+
+static ssize_t cma_sysfs_region_size_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%zu\n", reg->size);
+}
+
+static ssize_t cma_sysfs_region_free_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%zu\n", reg->free_space);
+}
+
+static ssize_t cma_sysfs_region_users_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%u\n", reg->users);
+}
+
+static ssize_t cma_sysfs_region_alloc_show(struct cma_region *reg, char *page)
+{
+	if (reg->alloc)
+		return snprintf(page, PAGE_SIZE, "%s\n",
+				reg->alloc->name ?: "-");
+	else if (reg->alloc_name)
+		return snprintf(page, PAGE_SIZE, "[%s]\n", reg->alloc_name);
+	else
+		return 0;
+}
+
+static int
+cma_sysfs_region_alloc_store(struct cma_region *reg, const char *page)
+{
+	char *s;
+
+	if (reg->alloc && reg->users)
+		return -EBUSY;
+
+	if (!*page || *page == '\n') {
+		s = NULL;
+	} else {
+		size_t len;
+
+		for (s = (char *)page; *++s && *s != '\n'; )
+			/* nop */;
+
+		len = s - page;
+		s = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!s)
+			return -ENOMEM;
+		s[len] = '\0';
+	}
+
+	if (reg->alloc)
+		__cma_region_detach_alloc(reg);
+
+	if (reg->free_alloc_name)
+		kfree(reg->alloc_name);
+
+	reg->alloc_name = s;
+	reg->free_alloc_name = !!s;
+
+	return 0;
+}
+
+
+static ssize_t
+cma_sysfs_region_show(struct kobject *kobj, struct attribute *attr,
+		      char *buf)
+{
+	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
+	struct cma_region_attribute *rattr =
+		container_of(attr, struct cma_region_attribute, attr);
+	ssize_t ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->show(reg, buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret;
+}
+
+static int
+cma_sysfs_region_store(struct kobject *kobj, struct attribute *attr,
+		       const char *buf, size_t count)
+{
+	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
+	struct cma_region_attribute *rattr =
+		container_of(attr, struct cma_region_attribute, attr);
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->store(reg, buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret < 0 ? ret : count;
+}
+
+static struct kobj_type cma_sysfs_region_type = {
+	.sysfs_ops	= &(const struct sysfs_ops){
+		.show	= cma_sysfs_region_show,
+		.store	= cma_sysfs_region_store,
+	},
+	.default_attrs	= (struct attribute * []) {
+		CMA_ATTR_RO_INLINE(region, name),
+		CMA_ATTR_RO_INLINE(region, start),
+		CMA_ATTR_RO_INLINE(region, size),
+		CMA_ATTR_RO_INLINE(region, free),
+		CMA_ATTR_RO_INLINE(region, users),
+		CMA_ATTR_INLINE(region, alloc),
+		NULL
+	},
+};
+
+static void __cma_sysfs_region_add(struct cma_region *reg)
+{
+	int ret;
+
+	if (!cma_sysfs_regions_ready)
+		return;
+
+	memset(&reg->kobj, 0, sizeof reg->kobj);
+
+	ret = kobject_init_and_add(&reg->kobj, &cma_sysfs_region_type,
+				   &cma_sysfs_regions,
+				   "%p", (void *)reg->start);
+
+	if (reg->name &&
+	    sysfs_create_link(&cma_sysfs_regions, &reg->kobj, reg->name) < 0)
+		/* Ignore any errors. */;
+}
+
+#else
+
+static void __cma_sysfs_region_add(struct cma_region *reg)
+{
+	/* nop */
+}
+
+#endif
+
+
 /************************* Chunks *************************/
 
 /* All chunks sorted by start address. */
@@ -785,6 +1114,18 @@ static int __cma_region_attach_alloc(struct cma_region *reg)
 	return ret;
 }
 
+static void __cma_region_detach_alloc(struct cma_region *reg)
+{
+	if (!reg->alloc)
+		return;
+
+	if (reg->alloc->cleanup)
+		reg->alloc->cleanup(reg);
+
+	reg->alloc = NULL;
+	reg->used = 1;
+}
+
 
 /*
  * s            ::= rules
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 3/9] mm: cma: Added SysFS support
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel

The SysFS development interface lets one change the map attribute
at run time as well as observe what regions have been reserved.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 .../ABI/testing/sysfs-kernel-mm-contiguous         |   53 +++
 Documentation/contiguous-memory.txt                |    4 +
 include/linux/cma.h                                |    7 +
 mm/Kconfig                                         |   26 ++-
 mm/cma.c                                           |  345 +++++++++++++++++++-
 5 files changed, 430 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
new file mode 100644
index 0000000..8df15bc
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
@@ -0,0 +1,53 @@
+What:		/sys/kernel/mm/contiguous/
+Date:		August 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		If CMA has been built with SysFS support,
+		/sys/kernel/mm/contiguous/ contains a file called
+		"map", a file called "allocators" and a directory
+		called "regions".
+
+		The "map" file lets one change the CMA's map attribute
+		at run-time.
+
+		The "allocators" file list all registered allocators.
+		Allocators with no name are listed as a single minus
+		sign.
+
+		The "regions" directory list all reserved regions.
+
+		For more details see
+		Documentation/contiguous-memory.txt.
+
+What:		/sys/kernel/mm/contiguous/regions/
+Date:		August 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		The /sys/kernel/mm/contiguous/regions/ directory
+		contain directories for each registered CMA region.
+		The name of the directory is the same as the start
+		address of the region.
+
+		If region is named there is also a symbolic link named
+		like the region pointing to the region's directory.
+
+		Such directory contains the following files:
+
+		* "name"  -- the name of the region or an empty file
+		* "start" -- starting address of the region (formatted
+		            with %p, ie. hex).
+		* "size"  -- size of the region (in bytes).
+		* "free"  -- free space in the region (in bytes).
+		* "users" -- number of chunks allocated in the region.
+		* "alloc" -- name of the allocator.
+
+		If allocator is not attached to the region, "alloc" is
+		either the name of desired allocator in square
+		brackets (ie. "[foo]") or an empty file if region is
+		to be attached to default allocator.  If an allocator
+		is attached to the region. "alloc" is either its name
+		or "-" if attached allocator has no name.
+
+		If there are no chunks allocated in given region
+		("users" is "0") then a name of desired allocator can
+		be written to "alloc".
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
index e470c6f..15aff7a 100644
--- a/Documentation/contiguous-memory.txt
+++ b/Documentation/contiguous-memory.txt
@@ -256,6 +256,10 @@
      iff it matched in previous pattern.  If the second part is
      omitted it will mach any type of memory requested by device.
 
+     If SysFS support is enabled, this attribute is accessible via
+     SysFS and can be changed at run-time by writing to
+     /sys/kernel/mm/contiguous/map.
+
      Some examples (whitespace added for better readability):
 
          cma_map = foo/quaz = r1;
diff --git a/include/linux/cma.h b/include/linux/cma.h
index f6f9cb5..d0f41f4 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -18,6 +18,9 @@
 #include <linux/rbtree.h>
 #include <linux/list.h>
 #include <linux/init.h>
+#if defined CONFIG_CMA_SYSFS
+#  include <linux/kobject.h>
+#endif
 
 
 struct device;
@@ -204,6 +207,10 @@ struct cma_region {
 	unsigned users;
 	struct list_head list;
 
+#if defined CONFIG_CMA_SYSFS
+	struct kobject kobj;
+#endif
+
 	unsigned used:1;
 	unsigned registered:1;
 	unsigned reserved:1;
diff --git a/mm/Kconfig b/mm/Kconfig
index 86043a3..8bed799 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -322,16 +322,36 @@ config CMA
 	  For more information see <Documentation/contiguous-memory.txt>.
 	  If unsure, say "n".
 
-config CMA_DEBUG
-	bool "CMA debug messages (DEVELOPEMENT)"
+config CMA_DEVELOPEMENT
+	bool "Include CMA developement features"
 	depends on CMA
 	help
+	  This lets you enable some developement features of the CMA
+	  framework.  It does not add any code to the kernel.
+
+	  Those options are mostly usable during development and testing.
+	  If unsure, say "n".
+
+config CMA_DEBUG
+	bool "CMA debug messages"
+	depends on CMA_DEVELOPEMENT
+	help
 	  Turns on debug messages in CMA.  This produces KERN_DEBUG
 	  messages for every CMA call as well as various messages while
 	  processing calls such as cma_alloc().  This option does not
 	  affect warning and error messages.
 
-	  This is mostly used during development.  If unsure, say "n".
+config CMA_SYSFS
+	bool "CMA SysFS interface support"
+	depends on CMA_DEVELOPEMENT
+	help
+	  Enable support for SysFS interface.  The interface is available
+	  under /sys/kernel/mm/contiguous.  Each region and allocator is
+	  represented there.
+
+	  For more information consult
+	  <Documentation/contiguous-memory.txt> and
+	  <Documentation/ABI/testing/sysfs-kernel-mm-contiguous> files.
 
 config CMA_BEST_FIT
 	bool "CMA best-fit allocator"
diff --git a/mm/cma.c b/mm/cma.c
index 06d0d5a..955f08c 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -38,8 +38,8 @@
 
 
 /*
- * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
- * cma_chunks_by_start.
+ * Protects cma_regions, cma_allocators, cma_map, cma_map_length,
+ * cma_kobj, cma_sysfs_regions and cma_chunks_by_start.
  */
 static DEFINE_MUTEX(cma_mutex);
 
@@ -143,7 +143,11 @@ int __init __must_check cma_early_region_register(struct cma_region *reg)
 
 /************************* Regions & Allocators *************************/
 
+static void __cma_sysfs_region_add(struct cma_region *reg);
+
 static int __cma_region_attach_alloc(struct cma_region *reg);
+static void __maybe_unused __cma_region_detach_alloc(struct cma_region *reg);
+
 
 /* List of all regions.  Named regions are kept before unnamed. */
 static LIST_HEAD(cma_regions);
@@ -226,6 +230,8 @@ int __must_check cma_region_register(struct cma_region *reg)
 	else
 		list_add_tail(&reg->list, &cma_regions);
 
+	__cma_sysfs_region_add(reg);
+
 done:
 	mutex_unlock(&cma_mutex);
 
@@ -482,6 +488,329 @@ subsys_initcall(cma_init);
 
 
 
+/************************* SysFS *************************/
+
+#if defined CONFIG_CMA_SYSFS
+
+static struct kobject cma_sysfs_regions;
+static int cma_sysfs_regions_ready;
+
+
+#define CMA_ATTR_INLINE(_type, _name)					\
+	(&((struct cma_ ## _type ## _attribute){			\
+		.attr	= {						\
+			.name	= __stringify(_name),			\
+			.mode	= 0644,					\
+		},							\
+		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
+		.store	= cma_sysfs_ ## _type ## _ ## _name ## _store,	\
+	}).attr)
+
+#define CMA_ATTR_RO_INLINE(_type, _name)				\
+	(&((struct cma_ ## _type ## _attribute){			\
+		.attr	= {						\
+			.name	= __stringify(_name),			\
+			.mode	= 0444,					\
+		},							\
+		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
+	}).attr)
+
+
+struct cma_root_attribute {
+	struct attribute attr;
+	ssize_t (*show)(char *buf);
+	int (*store)(const char *buf);
+};
+
+static ssize_t cma_sysfs_root_map_show(char *page)
+{
+	ssize_t len;
+
+	len = cma_map_length;
+	if (!len) {
+		*page = 0;
+		len = 0;
+	} else {
+		if (len > (size_t)PAGE_SIZE - 1)
+			len = (size_t)PAGE_SIZE - 1;
+		memcpy(page, cma_map, len);
+		page[len++] = '\n';
+	}
+
+	return len;
+}
+
+static int cma_sysfs_root_map_store(const char *page)
+{
+	ssize_t len = cma_map_validate(page);
+	char *val = NULL;
+
+	if (len < 0)
+		return len;
+
+	if (len) {
+		val = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!val)
+			return -ENOMEM;
+		val[len] = '\0';
+	}
+
+	kfree(cma_map);
+	cma_map = val;
+	cma_map_length = len;
+
+	return 0;
+}
+
+static ssize_t cma_sysfs_root_allocators_show(char *page)
+{
+	struct cma_allocator *alloc;
+	size_t left = PAGE_SIZE;
+	char *ch = page;
+
+	cma_foreach_allocator(alloc) {
+		ssize_t l = snprintf(ch, left, "%s ", alloc->name ?: "-");
+		ch   += l;
+		left -= l;
+	}
+
+	if (ch != page)
+		ch[-1] = '\n';
+	return ch - page;
+}
+
+static ssize_t
+cma_sysfs_root_show(struct kobject *kobj, struct attribute *attr, char *buf)
+{
+	struct cma_root_attribute *rattr =
+		container_of(attr, struct cma_root_attribute, attr);
+	ssize_t ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->show(buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret;
+}
+
+static ssize_t
+cma_sysfs_root_store(struct kobject *kobj, struct attribute *attr,
+		       const char *buf, size_t count)
+{
+	struct cma_root_attribute *rattr =
+		container_of(attr, struct cma_root_attribute, attr);
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->store(buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret < 0 ? ret : count;
+}
+
+static struct kobj_type cma_sysfs_root_type = {
+	.sysfs_ops	= &(const struct sysfs_ops){
+		.show	= cma_sysfs_root_show,
+		.store	= cma_sysfs_root_store,
+	},
+	.default_attrs	= (struct attribute * []) {
+		CMA_ATTR_INLINE(root, map),
+		CMA_ATTR_RO_INLINE(root, allocators),
+		NULL
+	},
+};
+
+static int __init cma_sysfs_init(void)
+{
+	static struct kobject root;
+	static struct kobj_type fake_type;
+
+	struct cma_region *reg;
+	int ret;
+
+	/* Root */
+	ret = kobject_init_and_add(&root, &cma_sysfs_root_type,
+				   mm_kobj, "contiguous");
+	if (unlikely(ret < 0)) {
+		pr_err("init: unable to add root kobject: %d\n", ret);
+		return ret;
+	}
+
+	/* Regions */
+	ret = kobject_init_and_add(&cma_sysfs_regions, &fake_type,
+				   &root, "regions");
+	if (unlikely(ret < 0)) {
+		pr_err("init: unable to add regions kobject: %d\n", ret);
+		return ret;
+	}
+
+	mutex_lock(&cma_mutex);
+	cma_sysfs_regions_ready = 1;
+	cma_foreach_region(reg)
+		__cma_sysfs_region_add(reg);
+	mutex_unlock(&cma_mutex);
+
+	return 0;
+}
+device_initcall(cma_sysfs_init);
+
+
+
+struct cma_region_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct cma_region *reg, char *buf);
+	int (*store)(struct cma_region *reg, const char *buf);
+};
+
+
+static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
+{
+	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
+}
+
+static ssize_t cma_sysfs_region_start_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%p\n", (void *)reg->start);
+}
+
+static ssize_t cma_sysfs_region_size_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%zu\n", reg->size);
+}
+
+static ssize_t cma_sysfs_region_free_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%zu\n", reg->free_space);
+}
+
+static ssize_t cma_sysfs_region_users_show(struct cma_region *reg, char *page)
+{
+	return snprintf(page, PAGE_SIZE, "%u\n", reg->users);
+}
+
+static ssize_t cma_sysfs_region_alloc_show(struct cma_region *reg, char *page)
+{
+	if (reg->alloc)
+		return snprintf(page, PAGE_SIZE, "%s\n",
+				reg->alloc->name ?: "-");
+	else if (reg->alloc_name)
+		return snprintf(page, PAGE_SIZE, "[%s]\n", reg->alloc_name);
+	else
+		return 0;
+}
+
+static int
+cma_sysfs_region_alloc_store(struct cma_region *reg, const char *page)
+{
+	char *s;
+
+	if (reg->alloc && reg->users)
+		return -EBUSY;
+
+	if (!*page || *page == '\n') {
+		s = NULL;
+	} else {
+		size_t len;
+
+		for (s = (char *)page; *++s && *s != '\n'; )
+			/* nop */;
+
+		len = s - page;
+		s = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!s)
+			return -ENOMEM;
+		s[len] = '\0';
+	}
+
+	if (reg->alloc)
+		__cma_region_detach_alloc(reg);
+
+	if (reg->free_alloc_name)
+		kfree(reg->alloc_name);
+
+	reg->alloc_name = s;
+	reg->free_alloc_name = !!s;
+
+	return 0;
+}
+
+
+static ssize_t
+cma_sysfs_region_show(struct kobject *kobj, struct attribute *attr,
+		      char *buf)
+{
+	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
+	struct cma_region_attribute *rattr =
+		container_of(attr, struct cma_region_attribute, attr);
+	ssize_t ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->show(reg, buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret;
+}
+
+static int
+cma_sysfs_region_store(struct kobject *kobj, struct attribute *attr,
+		       const char *buf, size_t count)
+{
+	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
+	struct cma_region_attribute *rattr =
+		container_of(attr, struct cma_region_attribute, attr);
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	ret = rattr->store(reg, buf);
+	mutex_unlock(&cma_mutex);
+
+	return ret < 0 ? ret : count;
+}
+
+static struct kobj_type cma_sysfs_region_type = {
+	.sysfs_ops	= &(const struct sysfs_ops){
+		.show	= cma_sysfs_region_show,
+		.store	= cma_sysfs_region_store,
+	},
+	.default_attrs	= (struct attribute * []) {
+		CMA_ATTR_RO_INLINE(region, name),
+		CMA_ATTR_RO_INLINE(region, start),
+		CMA_ATTR_RO_INLINE(region, size),
+		CMA_ATTR_RO_INLINE(region, free),
+		CMA_ATTR_RO_INLINE(region, users),
+		CMA_ATTR_INLINE(region, alloc),
+		NULL
+	},
+};
+
+static void __cma_sysfs_region_add(struct cma_region *reg)
+{
+	int ret;
+
+	if (!cma_sysfs_regions_ready)
+		return;
+
+	memset(&reg->kobj, 0, sizeof reg->kobj);
+
+	ret = kobject_init_and_add(&reg->kobj, &cma_sysfs_region_type,
+				   &cma_sysfs_regions,
+				   "%p", (void *)reg->start);
+
+	if (reg->name &&
+	    sysfs_create_link(&cma_sysfs_regions, &reg->kobj, reg->name) < 0)
+		/* Ignore any errors. */;
+}
+
+#else
+
+static void __cma_sysfs_region_add(struct cma_region *reg)
+{
+	/* nop */
+}
+
+#endif
+
+
 /************************* Chunks *************************/
 
 /* All chunks sorted by start address. */
@@ -785,6 +1114,18 @@ static int __cma_region_attach_alloc(struct cma_region *reg)
 	return ret;
 }
 
+static void __cma_region_detach_alloc(struct cma_region *reg)
+{
+	if (!reg->alloc)
+		return;
+
+	if (reg->alloc->cleanup)
+		reg->alloc->cleanup(reg);
+
+	reg->alloc = NULL;
+	reg->used = 1;
+}
+
 
 /*
  * s            ::= rules
-- 
1.7.1

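A quick way to see the interface added above in action is a small
user-space program.  The sketch below only assumes the layout
registered by cma_sysfs_init() -- /sys/kernel/mm/contiguous with the
"map" and "allocators" attributes and a "regions" subdirectory -- while
the region directory name and the "bf" allocator name are placeholders
that depend on the platform's actual configuration.

/* cma-sysfs-demo.c -- sketch that reads the CMA sysfs attributes and
 * tries switching a region's allocator.  Entries under "regions/" are
 * named after each region's start address, so "0x40000000" and the
 * "bf" allocator below are placeholders only.
 * Build: gcc -Wall -o cma-sysfs-demo cma-sysfs-demo.c
 */
#include <stdio.h>

static void show(const char *path)
{
	char buf[128];
	FILE *f = fopen(path, "r");

	if (!f)
		return;
	if (fgets(buf, sizeof buf, f))
		printf("%-50s %s", path, buf);
	fclose(f);
}

int main(void)
{
	FILE *f;

	/* Root attributes. */
	show("/sys/kernel/mm/contiguous/map");
	show("/sys/kernel/mm/contiguous/allocators");

	/* Per-region attributes; the directory name is the region's
	 * start address and is platform dependent. */
	show("/sys/kernel/mm/contiguous/regions/0x40000000/size");
	show("/sys/kernel/mm/contiguous/regions/0x40000000/free");

	/* Writing an allocator name to "alloc" detaches the current
	 * allocator and records the new name; it fails with EBUSY if
	 * the region has an attached allocator that is still in use. */
	f = fopen("/sys/kernel/mm/contiguous/regions/0x40000000/alloc", "w");
	if (f) {
		fputs("bf\n", f);
		fclose(f);
	}

	return 0;
}
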
^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 4/9] mm: cma: Added command line parameters support
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This patch adds a pair of early parameters ("cma" and
"cma.map") which let one override the CMA configuration
given by the platform without the need to recompile the kernel.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/contiguous-memory.txt |   85 ++++++++++++++++++++++--
 Documentation/kernel-parameters.txt |    7 ++
 mm/Kconfig                          |    9 +++
 mm/cma.c                            |  125 +++++++++++++++++++++++++++++++++++
 4 files changed, 221 insertions(+), 5 deletions(-)

diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
index 15aff7a..3d9d42c 100644
--- a/Documentation/contiguous-memory.txt
+++ b/Documentation/contiguous-memory.txt
@@ -88,6 +88,20 @@
            early region and the framework will handle the rest
            including choosing the right early allocator.
 
+    4. CMA allows run-time configuration of the memory regions it
+       will use to allocate chunks of memory from.  The set of memory
+       regions is given on the command line so it can be easily
+       changed without recompiling the kernel.
+
+       Each region has its own size, alignment demand, start address
+       (physical address where it should be placed) and an allocator
+       algorithm assigned to it.
+
+       This means that different allocation algorithms can run at the
+       same time, if different devices on the platform have distinct
+       memory usage characteristics and different algorithms match
+       them best.
+
 ** Use cases
 
     Let's analyse some imaginary system that uses the CMA to see how
@@ -162,7 +176,6 @@
     This solution also shows how with CMA you can assign private pools
     of memory to each device if that is required.
 
-
     Allocation mechanisms can be replaced dynamically in a similar
     manner as well. Let's say that during testing, it has been
     discovered that, for a given shared region of 40 MiB,
@@ -217,6 +230,42 @@
      it will be set to a PAGE_SIZE.  start will be aligned to
      alignment.
 
+     If command line parameter support is enabled, this attribute can
+     also be overridden by the "cma" command line parameter.  When
+     given on the command line its format is as follows:
+
+         regions-attr  ::= [ regions [ ';' ] ]
+         regions       ::= region [ ';' regions ]
+
+         region        ::= REG-NAME
+                             '=' size
+                           [ '@' start ]
+                           [ '/' alignment ]
+                           [ ':' ALLOC-NAME ]
+
+         size          ::= MEMSIZE   // size of the region
+         start         ::= MEMSIZE   // desired start address of
+                                     // the region
+         alignment     ::= MEMSIZE   // alignment of the start
+                                     // address of the region
+
+     REG-NAME specifies the name of the region.  All regions given
+     via the regions attribute need to have a name.  Moreover, each
+     region needs a unique name.  If two regions have the same name
+     it is unspecified which one will be used when a request asks to
+     allocate memory from a region with that name.
+
+     ALLOC-NAME specifies the name of the allocator to be used with
+     the region.  If no allocator name is provided, the "default"
+     allocator will be used with the region.  The "default" allocator
+     is, of course, the first allocator that has been registered. ;)
+
+     size, start and alignment are specified in bytes with the
+     suffixes that memparse() accepts.  If start is given, the region
+     will be reserved at the given start address (or as close to it
+     as possible).  If alignment is specified, the region will be
+     aligned to the given value.
+
 **** Map
 
      The format of the "map" attribute is as follows:
@@ -260,8 +309,33 @@
      SysFS and can be changed at run-time by writing to
      /sys/kernel/mm/contiguous/map.
 
+     If command line parameter support is enabled, this attribute can
+     also be overridden by the "cma.map" command line parameter.
+
+**** Examples
+
      Some examples (whitespace added for better readability):
 
+         cma = r1 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                                // (or at least as near as possible)
+                    /1M         // make sure it's aligned to 1M
+                    :foo(bar);  // uses allocator "foo" with "bar"
+                                // as parameters for it
+               r2 = 64M       // 64M region
+                    /1M;        // make sure it's aligned to 1M
+                                // uses the first available allocator
+               r3 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                    :foo;       // uses allocator "foo" with no parameters
+
+         cma_map = foo = r1;
+                       // device foo with kind==NULL uses region r1
+
+                   foo/quaz = r2;  // OR:
+                   /quaz = r2;
+                       // device foo with kind == "quaz" uses region r2
+
          cma_map = foo/quaz = r1;
                        // device foo with type == "quaz" uses region r1
 
@@ -529,10 +603,11 @@
 
         int cma_set_defaults(struct cma_region *regions, const char *map)
 
-    It needs to be called prior to reserving regions.  It let one
-    specify the list of regions defined by platform and the map
-    attribute.  The map may point to a string in __initdata.  See
-    above in this document for example usage of this function.
+    It needs to be called after early params have been parsed but
+    prior to reserving regions.  It lets one specify the list of
+    regions defined by the platform and the map attribute.  The map
+    may point to a string in __initdata.  See above in this document
+    for example usage of this function.
 
 ** Future work
 
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b7eb33f..015e458 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -43,6 +43,7 @@ parameter is applicable:
 	AVR32	AVR32 architecture is enabled.
 	AX25	Appropriate AX.25 support is enabled.
 	BLACKFIN Blackfin architecture is enabled.
+	CMA	Contiguous Memory Allocator is enabled.
 	EDD	BIOS Enhanced Disk Drive Services (EDD) is enabled
 	EFI	EFI Partitioning (GPT) is enabled
 	EIDE	EIDE/ATAPI support is enabled.
@@ -478,6 +479,12 @@ and is between 256 and 4096 characters. It is defined in the file
 			Also note the kernel might malfunction if you disable
 			some critical bits.
 
+	cma=		[CMA] List of CMA regions.
+			See Documentation/contiguous-memory.txt for details.
+
+	cma.map=	[CMA] CMA mapping
+			See Documentation/contiguous-memory.txt for details.
+
 	cmo_free_hint=	[PPC] Format: { yes | no }
 			Specify whether pages are marked as being inactive
 			when they are freed.  This is used in CMO environments
diff --git a/mm/Kconfig b/mm/Kconfig
index 8bed799..b410910 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -353,6 +353,15 @@ config CMA_SYSFS
 	  <Documentation/contiguous-memory.txt> and
 	  <Documentation/ABI/testing/sysfs-kernel-mm-contiguous> files.
 
+config CMA_CMDLINE
+	bool "CMA command line parameters support"
+	depends on CMA_DEVELOPEMENT
+	help
+	  Enable support for cma and cma.map command line parameters.
+	  This lets one overwrite the CMA defaults defined for the
+	  platform.  It is meant to be used only during development
+	  and testing.
+
 config CMA_BEST_FIT
 	bool "CMA best-fit allocator"
 	depends on CMA
diff --git a/mm/cma.c b/mm/cma.c
index 955f08c..8191c97 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -103,6 +103,12 @@ static int __init cma_map_param(char *param)
 	return 0;
 }
 
+#if defined CONFIG_CMA_CMDLINE
+
+early_param("cma.map", cma_map_param);
+
+#endif
+
 
 
 /************************* Early regions *************************/
@@ -110,6 +116,125 @@ static int __init cma_map_param(char *param)
 struct list_head cma_early_regions __initdata =
 	LIST_HEAD_INIT(cma_early_regions);
 
+#ifdef CONFIG_CMA_CMDLINE
+
+/*
+ * regions-attr ::= [ regions [ ';' ] ]
+ * regions      ::= region [ ';' regions ]
+ *
+ * region       ::= [ '-' ] reg-name
+ *                    '=' size
+ *                  [ '@' start ]
+ *                  [ '/' alignment ]
+ *                  [ ':' alloc-name ]
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example:
+ * cma=reg1=64M:bf;reg2=32M@0x100000:bf;reg3=64M/1M:bf
+ *
+ * If the allocator is omitted, the first available allocator will be used.
+ */
+
+#define NUMPARSE(cond_ch, type, cond) ({				\
+		unsigned long long v = 0;				\
+		if (*param == (cond_ch)) {				\
+			const char *const msg = param + 1;		\
+			v = memparse(msg, &param);			\
+			if (!v || v > ~(type)0 || !(cond)) {		\
+				pr_err("param: invalid value near %s\n", msg); \
+				ret = -EINVAL;				\
+				break;					\
+			}						\
+		}							\
+		v;							\
+	})
+
+static int __init cma_param_parse(char *param)
+{
+	static struct cma_region regions[16];
+
+	size_t left = ARRAY_SIZE(regions);
+	struct cma_region *reg = regions;
+	int ret = 0;
+
+	pr_debug("param: %s\n", param);
+
+	for (; *param; ++reg) {
+		dma_addr_t start, alignment;
+		size_t size;
+
+		if (unlikely(!--left)) {
+			pr_err("param: too many early regions\n");
+			return -ENOSPC;
+		}
+
+		/* Parse name */
+		reg->name = param;
+		param = strchr(param, '=');
+		if (!param || param == reg->name) {
+			pr_err("param: expected \"<name>=\" near %s\n",
+			       reg->name);
+			ret = -EINVAL;
+			break;
+		}
+		*param = '\0';
+
+		/* Parse numbers */
+		size      = NUMPARSE('\0', size_t, true);
+		start     = NUMPARSE('@', dma_addr_t, true);
+		alignment = NUMPARSE('/', dma_addr_t, (v & (v - 1)) == 0);
+
+		alignment = max(alignment, (dma_addr_t)PAGE_SIZE);
+		start     = ALIGN(start, alignment);
+		size      = PAGE_ALIGN(size);
+		if (start + size < start) {
+			pr_err("param: invalid start, size combination\n");
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Parse allocator */
+		if (*param == ':') {
+			reg->alloc_name = ++param;
+			while (*param && *param != ';')
+				++param;
+			if (param == reg->alloc_name)
+				reg->alloc_name = NULL;
+		}
+
+		/* Go to next */
+		if (*param == ';') {
+			*param = '\0';
+			++param;
+		} else if (*param) {
+			pr_err("param: expecting ';' or end of parameter near %s\n",
+			       param);
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Add */
+		reg->size      = size;
+		reg->start     = start;
+		reg->alignment = alignment;
+		reg->copy_name = 1;
+
+		list_add_tail(&reg->list, &cma_early_regions);
+
+		pr_debug("param: registering early region %s (%p@%p/%p)\n",
+			 reg->name, (void *)reg->size, (void *)reg->start,
+			 (void *)reg->alignment);
+	}
+
+	return ret;
+}
+early_param("cma", cma_param_parse);
+
+#undef NUMPARSE
+
+#endif
+
 
 int __init __must_check cma_early_region_register(struct cma_region *reg)
 {
-- 
1.7.1

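To make the interplay between these parameters and the platform
defaults more concrete, below is a purely illustrative board-file
sketch.  The cma_set_defaults() signature and the struct cma_region
fields are the ones appearing in this patch; the empty sentinel entry,
the device names "foo" and "bar", the region sizes and the reserve-hook
call site are assumptions made only for the example.

/* Illustrative platform code providing CMA defaults which the cma=
 * and cma.map= early parameters documented above may override.  The
 * sentinel-terminated array and the call site are assumptions, not
 * something this patch mandates.
 */
static struct cma_region board_regions[] __initdata = {
	{ .name = "common", .size = 64 << 20, .alignment = 1 << 20 },
	{ .name = "video",  .size = 32 << 20, .alloc_name = "bf" },
	{ }
};

/* "foo" and "bar" stand in for real device names. */
static const char board_map[] __initconst = "foo=video;bar/rec=common";

static void __init board_reserve(void)
{
	/*
	 * Called after early params have been parsed but before the
	 * regions are reserved, as required by cma_set_defaults().
	 */
	cma_set_defaults(board_regions, board_map);
}
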

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 5/9] mm: cma: Test device and application added
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This patch adds a "cma" misc device which lets user space use the
CMA API.  This device is meant for testing.  A testing application
is also provided.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/misc/Kconfig   |    8 +
 drivers/misc/Makefile  |    1 +
 drivers/misc/cma-dev.c |  202 +++++++++++++++++++++++++
 include/linux/cma.h    |   38 +++++
 tools/cma/cma-test.c   |  386 ++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 635 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 tools/cma/cma-test.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 2c38d4e..ac1dd45 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -407,4 +407,12 @@ source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
 source "drivers/misc/iwmc3200top/Kconfig"
 
+config CMA_DEVICE
+	tristate "CMA misc device (DEVELOPEMENT)"
+	depends on CMA_DEVELOPEMENT
+	help
+	  The CMA misc device allows allocating contiguous memory areas
+	  from user space.  This is mostly for testing of the CMA
+	  framework.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 21b4761..b08844a 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -36,3 +36,4 @@ obj-y				+= cb710/
 obj-$(CONFIG_VMWARE_BALLOON)	+= vmware_balloon.o
 obj-$(CONFIG_ARM_CHARLCD)	+= arm-charlcd.o
 obj-$(CONFIG_PCH_PHUB)		+= pch_phub.o
+obj-$(CONFIG_CMA_DEVICE)	+= cma-dev.o
diff --git a/drivers/misc/cma-dev.c b/drivers/misc/cma-dev.c
new file mode 100644
index 0000000..5ceb432
--- /dev/null
+++ b/drivers/misc/cma-dev.c
@@ -0,0 +1,202 @@
+/*
+ * Contiguous Memory Allocator userspace driver
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR_VALUE() */
+#include <linux/fs.h>          /* struct file */
+#include <linux/mm.h>          /* Memory stuff */
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/module.h>      /* Standard module stuff */
+#include <linux/device.h>      /* struct device, dev_dbg() */
+#include <linux/types.h>       /* Just to be safe ;) */
+#include <linux/uaccess.h>     /* __copy_{to,from}_user */
+#include <linux/miscdevice.h>  /* misc_register() and company */
+
+#include <linux/cma.h>
+
+static int  cma_file_open(struct inode *inode, struct file *file);
+static int  cma_file_release(struct inode *inode, struct file *file);
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg);
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma);
+
+
+static struct miscdevice cma_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name  = "cma",
+	.fops  = &(const struct file_operations) {
+		.owner          = THIS_MODULE,
+		.open           = cma_file_open,
+		.release        = cma_file_release,
+		.unlocked_ioctl = cma_file_ioctl,
+		.mmap           = cma_file_mmap,
+	},
+};
+#define cma_dev (cma_miscdev.this_device)
+
+
+#define cma_file_start(file) (((dma_addr_t *)(file)->private_data)[0])
+#define cma_file_size(file)  (((dma_addr_t *)(file)->private_data)[1])
+
+
+static int  cma_file_open(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	file->private_data = NULL;
+
+	return 0;
+}
+
+
+static int  cma_file_release(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (file->private_data) {
+		cma_free(cma_file_start(file));
+		kfree(file->private_data);
+	}
+
+	return 0;
+}
+
+
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg)
+{
+	struct cma_alloc_request req;
+	unsigned long addr;
+	long ret;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (cmd != IOCTL_CMA_ALLOC)
+		return -ENOTTY;
+
+	if (!arg)
+		return -EINVAL;
+
+	if (file->private_data) /* Already allocated */
+		return -EBADFD;
+
+	if (copy_from_user(&req, (void *)arg, sizeof req))
+		return -EFAULT;
+
+	if (req.magic != CMA_MAGIC)
+		return -ENOTTY;
+
+	if (req.type != CMA_REQ_DEV_KIND && req.type != CMA_REQ_FROM_REG)
+		return -EINVAL;
+
+	/* May happen on 32 bit system. */
+	if (req.size > ~(typeof(req.size))0 ||
+	    req.alignment > ~(typeof(req.alignment))0)
+		return -EINVAL;
+
+	if (strnlen(req.spec, sizeof req.spec) >= sizeof req.spec)
+		return -EINVAL;
+
+
+	file->private_data = kmalloc(2 * sizeof(dma_addr_t), GFP_KERNEL);
+	if (!file->private_data)
+		return -ENOSPC;
+
+
+	if (req.type == CMA_REQ_DEV_KIND) {
+		struct device fake_device;
+		char *kind;
+
+		fake_device.init_name = req.spec;
+		fake_device.kobj.name = req.spec;
+
+		kind = strrchr(req.spec, '/');
+		if (kind)
+			*kind++ = '\0';
+
+		addr = cma_alloc(&fake_device, kind, req.size, req.alignment);
+	} else {
+		addr = cma_alloc_from(req.spec, req.size, req.alignment);
+	}
+
+	if (IS_ERR_VALUE(addr)) {
+		ret = addr;
+		goto error_priv;
+	}
+
+
+	if (put_user(addr, (typeof(req.start) *)(arg + offsetof(typeof(req),
+								start)))) {
+		ret = -EFAULT;
+		goto error_put;
+	}
+
+	cma_file_start(file) = addr;
+	cma_file_size(file) = req.size;
+
+	dev_dbg(cma_dev, "allocated %p@%p\n",
+		(void *)(dma_addr_t)req.size, (void *)addr);
+
+	return 0;
+
+error_put:
+	cma_free(addr);
+error_priv:
+	kfree(file->private_data);
+	file->private_data = NULL;
+	return ret;
+}
+
+
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	unsigned long pgoff, offset, length;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (!file->private_data)
+		return -EBADFD;
+
+	pgoff  = vma->vm_pgoff;
+	offset = pgoff << PAGE_SHIFT;
+	length = vma->vm_end - vma->vm_start;
+
+	if (offset          >= cma_file_size(file)
+	 || length          >  cma_file_size(file)
+	 || offset + length >  cma_file_size(file))
+		return -ENOSPC;
+
+	return remap_pfn_range(vma, vma->vm_start,
+			       __phys_to_pfn(cma_file_start(file) + offset),
+			       length, vma->vm_page_prot);
+}
+
+
+
+static int __init cma_dev_init(void)
+{
+	int ret = misc_register(&cma_miscdev);
+	pr_debug("miscdev: register returned: %d\n", ret);
+	return ret;
+}
+module_init(cma_dev_init);
+
+static void __exit cma_dev_exit(void)
+{
+	dev_dbg(cma_dev, "deregistering\n");
+	misc_deregister(&cma_miscdev);
+}
+module_exit(cma_dev_exit);
diff --git a/include/linux/cma.h b/include/linux/cma.h
index d0f41f4..409bae4 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -11,6 +11,44 @@
  * See Documentation/contiguous-memory.txt for details.
  */
 
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+
+#define CMA_MAGIC (('c' << 24) | ('M' << 16) | ('a' << 8) | 0x42)
+
+enum {
+	CMA_REQ_DEV_KIND,
+	CMA_REQ_FROM_REG
+};
+
+/**
+ * Information about an area exportable to user space.
+ * @magic:	must always be CMA_MAGIC.
+ * @type:	type of the request.
+ * @spec:	either "dev/kind\0" or "regions\0" depending on @type.
+ *		In any case, the string must be NUL terminated.
+ *		Additionally, in the latter case scanning stops at
+ *		semicolon (';').
+ * @size:	size of the chunk to allocate.
+ * @alignment:	desired alignment of the chunk (must be power of two or zero).
+ * @start:	when ioctl() finishes this stores physical address of the chunk.
+ */
+struct cma_alloc_request {
+	__u32 magic;
+	__u32 type;
+
+	/* __u64 to be compatible across 32 and 64 bit systems. */
+	__u64 size;
+	__u64 alignment;
+	__u64 start;
+
+	char spec[32];
+};
+
+#define IOCTL_CMA_ALLOC    _IOWR('p', 0, struct cma_alloc_request)
+
+
 /***************************** Kernel level API *****************************/
 
 #ifdef __KERNEL__
diff --git a/tools/cma/cma-test.c b/tools/cma/cma-test.c
new file mode 100644
index 0000000..4aa6c5c
--- /dev/null
+++ b/tools/cma/cma-test.c
@@ -0,0 +1,386 @@
+/*
+ * cma-test.c -- CMA testing application
+ *
+ * Copyright (C) 2010 Samsung Electronics
+ *                    Author: Michal Nazarewicz <m.nazarewicz@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/* $(CROSS_COMPILE)gcc -Wall -Wextra -g -o cma-test cma-test.c  */
+
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <linux/cma.h>
+
+
+static void handle_command(char *line);
+
+int main(void)
+{
+	unsigned no = 1;
+	char line[1024];
+	int skip = 0;
+
+	fputs("commands:\n"
+	      " l or list                                      list allocated chunks\n"
+	      " a or alloc  <dev>/<kind> <size>[/<alignment>]  allocate chunk\n"
+	      " A or afrom  <regions>    <size>[/<alignment>]  allocate from region(s)\n"
+	      " f or free   [<num>]                            free a chunk\n"
+	      " # ...                                          comment\n"
+	      " <empty line>                                   repeat previous\n"
+	      "\n", stderr);
+
+	while (fgets(line, sizeof line, stdin)) {
+		char *nl = strchr(line, '\n');
+		if (nl) {
+			if (skip) {
+				fprintf(stderr, "cma: %d: line too long\n", no);
+				skip = 0;
+			} else {
+				*nl = '\0';
+				handle_command(line);
+			}
+			++no;
+		} else {
+			skip = 1;
+		}
+	}
+
+	if (skip)
+		fprintf(stderr, "cma: %d: no new line at EOF\n", no);
+	return 0;
+}
+
+
+
+static void cmd_list(char *name, char *line);
+static void cmd_alloc(char *name, char *line);
+static void cmd_alloc_from(char *name, char *line);
+static void cmd_free(char *name, char *line);
+
+static const struct command {
+	const char name[8];
+	void (*handle)(char *name, char *line);
+} commands[] = {
+	{ "list",  cmd_list },
+	{ "l",     cmd_list },
+	{ "alloc", cmd_alloc },
+	{ "a",     cmd_alloc },
+	{ "afrom", cmd_alloc_from },
+	{ "A",     cmd_alloc_from },
+	{ "free",  cmd_free },
+	{ "f",     cmd_free },
+	{ "",      NULL }
+};
+
+
+#define SKIP_SPACE(ch) do { while (isspace(*(ch))) ++(ch); } while (0)
+
+
+static void handle_command(char *line)
+{
+	static char last_line[1024];
+
+	const struct command *cmd;
+	char *name;
+
+	SKIP_SPACE(line);
+	if (*line == '#')
+		return;
+
+	if (!*line)
+		strcpy(line, last_line);
+	else
+		strcpy(last_line, line);
+
+	name = line;
+	while (*line && !isspace(*line))
+		++line;
+
+	if (*line) {
+		*line = '\0';
+		++line;
+	}
+
+	for (cmd = commands; *(cmd->name); ++cmd)
+		if (!strcmp(name, cmd->name)) {
+			cmd->handle(name, line);
+			return;
+		}
+
+	fprintf(stderr, "%s: unknown command\n", name);
+}
+
+
+
+struct chunk {
+	struct chunk *next, *prev;
+	int fd;
+	unsigned long size;
+	unsigned long start;
+};
+
+static struct chunk root = {
+	.next = &root,
+	.prev = &root,
+};
+
+#define for_each(a) for (a = root.next; a != &root; a = a->next)
+
+static struct chunk *chunk_create(const char *prefix);
+static void chunk_destroy(struct chunk *chunk);
+static void chunk_add(struct chunk *chunk);
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret);
+
+
+static void cmd_list(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	(void)name; (void)line;
+
+	for_each(chunk)
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+}
+
+
+static void __cma_alloc(char *name, char *line, int from);
+
+static void cmd_alloc(char *name, char *line)
+{
+	__cma_alloc(name, line, 0);
+}
+
+static void cmd_alloc_from(char *name, char *line)
+{
+	__cma_alloc(name, line, 1);
+}
+
+static void __cma_alloc(char *name, char *line, int from)
+{
+	static const char *what[2] = { "dev/kind", "regions" };
+
+	unsigned long size, alignment = 0;
+	struct cma_alloc_request req;
+	struct chunk *chunk;
+	char *spec;
+	size_t n;
+	int ret;
+
+	SKIP_SPACE(line);
+	if (!*line) {
+		fprintf(stderr, "%s: expecting %s\n", name, what[from]);
+		return;
+	}
+
+	for (spec = line; *line && !isspace(*line); ++line)
+		/* nothing */;
+
+	if (!*line) {
+		fprintf(stderr, "%s: expecting size after %s\n",
+			name, what[from]);
+		return;
+	}
+
+	*line++ = '\0';
+	n = line - spec;
+	if (n > sizeof req.spec) {
+		fprintf(stderr, "%s: %s too long\n", name, what[from]);
+		return;
+	}
+
+	if (memparse(line, &line, &size) < 0 || !size) {
+		fprintf(stderr, "%s: invalid size\n", name);
+		return;
+	}
+
+	if (*line == '/')
+		if (memparse(line, &line, &alignment) < 0) {
+			fprintf(stderr, "%s: invalid alignment\n", name);
+			return;
+		}
+
+	SKIP_SPACE(line);
+	if (*line) {
+		fprintf(stderr, "%s: unknown argument(s) at the end: %s\n",
+			name, line);
+		return;
+	}
+
+
+	chunk = chunk_create(name);
+	if (!chunk)
+		return;
+
+	fprintf(stderr, "%s: allocating %p/%p\n", name,
+		(void *)size, (void *)alignment);
+
+	req.magic     = CMA_MAGIC;
+	req.type      = from ? CMA_REQ_FROM_REG : CMA_REQ_DEV_KIND;
+	req.size      = size;
+	req.alignment = alignment;
+	req.start     = 0;
+
+	memcpy(req.spec, spec, n);
+	memset(req.spec + n, '\0', sizeof req.spec - n);
+
+	ret = ioctl(chunk->fd, IOCTL_CMA_ALLOC, &req);
+	if (ret < 0) {
+		fprintf(stderr, "%s: cma_alloc: %s\n", name, strerror(errno));
+		chunk_destroy(chunk);
+	} else {
+		chunk_add(chunk);
+		chunk->size  = req.size;
+		chunk->start = req.start;
+
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+	}
+}
+
+
+static void cmd_free(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	SKIP_SPACE(line);
+
+	if (*line) {
+		unsigned long num;
+
+		errno = 0;
+		num = strtoul(line, &line, 10);
+
+		if (errno || num > INT_MAX) {
+			fprintf(stderr, "%s: invalid number\n", name);
+			return;
+		}
+
+		SKIP_SPACE(line);
+		if (*line) {
+			fprintf(stderr,
+				"%s: unknown arguments at the end: %s\n",
+				name, line);
+			return;
+		}
+
+		for_each(chunk)
+			if (chunk->fd == (int)num)
+				goto ok;
+		fprintf(stderr, "%s: no chunk %3lu\n", name, num);
+		return;
+
+	} else {
+		chunk = root.prev;
+		if (chunk == &root) {
+			fprintf(stderr, "%s: no chunks\n", name);
+			return;
+		}
+	}
+
+ok:
+	fprintf(stderr, "%s: freeing %p@%p\n", name,
+		(void *)chunk->size, (void *)chunk->start);
+	chunk_destroy(chunk);
+}
+
+
+static struct chunk *chunk_create(const char *prefix)
+{
+	struct chunk *chunk;
+	int fd;
+
+	chunk = malloc(sizeof *chunk);
+	if (!chunk) {
+		fprintf(stderr, "%s: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	fd = open("/dev/cma", O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "%s: /dev/cma: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	chunk->prev = chunk;
+	chunk->next = chunk;
+	chunk->fd   = fd;
+	return chunk;
+}
+
+static void chunk_destroy(struct chunk *chunk)
+{
+	chunk->prev->next = chunk->next;
+	chunk->next->prev = chunk->prev;
+	close(chunk->fd);
+}
+
+static void chunk_add(struct chunk *chunk)
+{
+	chunk->next = &root;
+	chunk->prev = root.prev;
+	root.prev->next = chunk;
+	root.prev = chunk;
+}
+
+
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret)
+{
+	unsigned long val;
+
+	SKIP_SPACE(ptr);
+
+	errno = 0;
+	val = strtoul(ptr, &ptr, 0);
+	if (errno)
+		return -1;
+
+	switch (*ptr) {
+	case 'G':
+	case 'g':
+		val <<= 10;
+	case 'M':
+	case 'm':
+		val <<= 10;
+	case 'K':
+	case 'k':
+		val <<= 10;
+		++ptr;
+	}
+
+	if (retptr) {
+		SKIP_SPACE(ptr);
+		*retptr = ptr;
+	}
+
+	*ret = val;
+	return 0;
+}
-- 
1.7.1

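The interactive tool above wraps a fairly small core sequence; a
stripped-down sketch of the same /dev/cma usage follows.  The region
name "video" is a placeholder (real names come from the platform's CMA
configuration) and error handling is kept to the bare minimum.

/* Minimal sketch of the ioctl + mmap sequence that cma-test.c builds
 * its commands around.  "video" is a placeholder region name.
 */
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#include <linux/cma.h>

int main(void)
{
	struct cma_alloc_request req;
	void *virt;
	int fd;

	fd = open("/dev/cma", O_RDWR);
	if (fd < 0) {
		perror("/dev/cma");
		return 1;
	}

	memset(&req, 0, sizeof req);
	req.magic     = CMA_MAGIC;
	req.type      = CMA_REQ_FROM_REG;  /* allocate from named region(s) */
	req.size      = 1 << 20;           /* 1 MiB */
	req.alignment = 0;                 /* no particular alignment */
	strcpy(req.spec, "video");

	if (ioctl(fd, IOCTL_CMA_ALLOC, &req) < 0) {
		perror("IOCTL_CMA_ALLOC");
		close(fd);
		return 1;
	}
	printf("chunk at physical address %#llx\n",
	       (unsigned long long)req.start);

	/* The chunk can be mapped through the same descriptor and is
	 * freed automatically when the descriptor is closed. */
	virt = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, 0);
	if (virt != MAP_FAILED)
		munmap(virt, 1 << 20);

	close(fd);
	return 0;
}
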

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 5/9] mm: cma: Test device and application added
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This patch adds a "cma" misc device which lets user space use the
CMA API.  This device is meant for testing.  A testing application
is also provided.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/misc/Kconfig   |    8 +
 drivers/misc/Makefile  |    1 +
 drivers/misc/cma-dev.c |  202 +++++++++++++++++++++++++
 include/linux/cma.h    |   38 +++++
 tools/cma/cma-test.c   |  386 ++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 635 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 tools/cma/cma-test.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 2c38d4e..ac1dd45 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -407,4 +407,12 @@ source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
 source "drivers/misc/iwmc3200top/Kconfig"
 
+config CMA_DEVICE
+	tristate "CMA misc device (DEVELOPEMENT)"
+	depends on CMA_DEVELOPEMENT
+	help
+	  The CMA misc device allows allocating contiguous memory areas
+	  from user space.  This is mostly for testing of the CMA
+	  framework.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 21b4761..b08844a 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -36,3 +36,4 @@ obj-y				+= cb710/
 obj-$(CONFIG_VMWARE_BALLOON)	+= vmware_balloon.o
 obj-$(CONFIG_ARM_CHARLCD)	+= arm-charlcd.o
 obj-$(CONFIG_PCH_PHUB)		+= pch_phub.o
+obj-$(CONFIG_CMA_DEVICE)	+= cma-dev.o
diff --git a/drivers/misc/cma-dev.c b/drivers/misc/cma-dev.c
new file mode 100644
index 0000000..5ceb432
--- /dev/null
+++ b/drivers/misc/cma-dev.c
@@ -0,0 +1,202 @@
+/*
+ * Contiguous Memory Allocator userspace driver
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR_VALUE() */
+#include <linux/fs.h>          /* struct file */
+#include <linux/mm.h>          /* Memory stuff */
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/module.h>      /* Standard module stuff */
+#include <linux/device.h>      /* struct device, dev_dbg() */
+#include <linux/types.h>       /* Just to be safe ;) */
+#include <linux/uaccess.h>     /* __copy_{to,from}_user */
+#include <linux/miscdevice.h>  /* misc_register() and company */
+
+#include <linux/cma.h>
+
+static int  cma_file_open(struct inode *inode, struct file *file);
+static int  cma_file_release(struct inode *inode, struct file *file);
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg);
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma);
+
+
+static struct miscdevice cma_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name  = "cma",
+	.fops  = &(const struct file_operations) {
+		.owner          = THIS_MODULE,
+		.open           = cma_file_open,
+		.release        = cma_file_release,
+		.unlocked_ioctl = cma_file_ioctl,
+		.mmap           = cma_file_mmap,
+	},
+};
+#define cma_dev (cma_miscdev.this_device)
+
+
+#define cma_file_start(file) (((dma_addr_t *)(file)->private_data)[0])
+#define cma_file_size(file)  (((dma_addr_t *)(file)->private_data)[1])
+
+
+static int  cma_file_open(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	file->private_data = NULL;
+
+	return 0;
+}
+
+
+static int  cma_file_release(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (file->private_data) {
+		cma_free(cma_file_start(file));
+		kfree(file->private_data);
+	}
+
+	return 0;
+}
+
+
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg)
+{
+	struct cma_alloc_request req;
+	unsigned long addr;
+	long ret;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (cmd != IOCTL_CMA_ALLOC)
+		return -ENOTTY;
+
+	if (!arg)
+		return -EINVAL;
+
+	if (file->private_data) /* Already allocated */
+		return -EBADFD;
+
+	if (copy_from_user(&req, (void *)arg, sizeof req))
+		return -EFAULT;
+
+	if (req.magic != CMA_MAGIC)
+		return -ENOTTY;
+
+	if (req.type != CMA_REQ_DEV_KIND && req.type != CMA_REQ_FROM_REG)
+		return -EINVAL;
+
+	/* May happen on 32 bit system. */
+	if (req.size > ~(dma_addr_t)0 ||
+	    req.alignment > ~(dma_addr_t)0)
+		return -EINVAL;
+
+	if (strnlen(req.spec, sizeof req.spec) >= sizeof req.spec)
+		return -EINVAL;
+
+
+	file->private_data = kmalloc(2 * sizeof(dma_addr_t), GFP_KERNEL);
+	if (!file->private_data)
+		return -ENOSPC;
+
+
+	if (req.type == CMA_REQ_DEV_KIND) {
+		struct device fake_device;
+		char *kind;
+
+		fake_device.init_name = req.spec;
+		fake_device.kobj.name = req.spec;
+
+		kind = strrchr(req.spec, '/');
+		if (kind)
+			*kind++ = '\0';
+
+		addr = cma_alloc(&fake_device, kind, req.size, req.alignment);
+	} else {
+		addr = cma_alloc_from(req.spec, req.size, req.alignment);
+	}
+
+	if (IS_ERR_VALUE(addr)) {
+		ret = addr;
+		goto error_priv;
+	}
+
+
+	if (put_user(addr, (typeof(req.start) *)(arg + offsetof(typeof(req),
+								start)))) {
+		ret = -EFAULT;
+		goto error_put;
+	}
+
+	cma_file_start(file) = addr;
+	cma_file_size(file) = req.size;
+
+	dev_dbg(cma_dev, "allocated %p@%p\n",
+		(void *)(dma_addr_t)req.size, (void *)addr);
+
+	return 0;
+
+error_put:
+	cma_free(addr);
+error_priv:
+	kfree(file->private_data);
+	file->private_data = NULL;
+	return ret;
+}
+
+
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	unsigned long pgoff, offset, length;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (!file->private_data)
+		return -EBADFD;
+
+	pgoff  = vma->vm_pgoff;
+	offset = pgoff << PAGE_SHIFT;
+	length = vma->vm_end - vma->vm_start;
+
+	if (offset          >= cma_file_size(file)
+	 || length          >  cma_file_size(file)
+	 || offset + length >  cma_file_size(file))
+		return -ENOSPC;
+
+	return remap_pfn_range(vma, vma->vm_start,
+			       __phys_to_pfn(cma_file_start(file) + offset),
+			       length, vma->vm_page_prot);
+}
+
+
+
+static int __init cma_dev_init(void)
+{
+	int ret = misc_register(&cma_miscdev);
+	pr_debug("miscdev: register returned: %d\n", ret);
+	return ret;
+}
+module_init(cma_dev_init);
+
+static void __exit cma_dev_exit(void)
+{
+	dev_dbg(cma_dev, "deregistering\n");
+	misc_deregister(&cma_miscdev);
+}
+module_exit(cma_dev_exit);
diff --git a/include/linux/cma.h b/include/linux/cma.h
index d0f41f4..409bae4 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -11,6 +11,44 @@
  * See Documentation/contiguous-memory.txt for details.
  */
 
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+
+#define CMA_MAGIC (('c' << 24) | ('M' << 16) | ('a' << 8) | 0x42)
+
+enum {
+	CMA_REQ_DEV_KIND,
+	CMA_REQ_FROM_REG
+};
+
+/**
+ * Information about a contiguous memory area, exportable to user space.
+ * @magic:	must always be CMA_MAGIC.
+ * @type:	type of the request.
+ * @spec:	either "dev/kind\0" or "regions\0" depending on @type.
+ *		In any case, the string must be NUL terminated.
+ *		Additionally, in the latter case scanning stops at
+ *		a semicolon (';').
+ * @size:	size of the chunk to allocate.
+ * @alignment:	desired alignment of the chunk (must be power of two or zero).
+ * @start:	when ioctl() finishes this stores physical address of the chunk.
+ */
+struct cma_alloc_request {
+	__u32 magic;
+	__u32 type;
+
+	/* __u64 to be compatible across 32 and 64 bit systems. */
+	__u64 size;
+	__u64 alignment;
+	__u64 start;
+
+	char spec[32];
+};
+
+#define IOCTL_CMA_ALLOC    _IOWR('p', 0, struct cma_alloc_request)
+
+
 /***************************** Kernel level API *****************************/
 
 #ifdef __KERNEL__
diff --git a/tools/cma/cma-test.c b/tools/cma/cma-test.c
new file mode 100644
index 0000000..4aa6c5c
--- /dev/null
+++ b/tools/cma/cma-test.c
@@ -0,0 +1,386 @@
+/*
+ * cma-test.c -- CMA testing application
+ *
+ * Copyright (C) 2010 Samsung Electronics
+ *                    Author: Michal Nazarewicz <m.nazarewicz@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/* $(CROSS_COMPILE)gcc -Wall -Wextra -g -o cma-test cma-test.c  */
+
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <linux/cma.h>
+
+
+static void handle_command(char *line);
+
+int main(void)
+{
+	unsigned no = 1;
+	char line[1024];
+	int skip = 0;
+
+	fputs("commands:\n"
+	      " l or list                                      list allocated chunks\n"
+	      " a or alloc  <dev>/<kind> <size>[/<alignment>]  allocate chunk\n"
+	      " A or afrom  <regions>    <size>[/<alignment>]  allocate from region(s)\n"
+	      " f or free   [<num>]                            free a chunk\n"
+	      " # ...                                          comment\n"
+	      " <empty line>                                   repeat previous\n"
+	      "\n", stderr);
+
+	while (fgets(line, sizeof line, stdin)) {
+		char *nl = strchr(line, '\n');
+		if (nl) {
+			if (skip) {
+				fprintf(stderr, "cma: %d: line too long\n", no);
+				skip = 0;
+			} else {
+				*nl = '\0';
+				handle_command(line);
+			}
+			++no;
+		} else {
+			skip = 1;
+		}
+	}
+
+	if (skip)
+		fprintf(stderr, "cma: %d: no new line at EOF\n", no);
+	return 0;
+}
+
+
+
+static void cmd_list(char *name, char *line);
+static void cmd_alloc(char *name, char *line);
+static void cmd_alloc_from(char *name, char *line);
+static void cmd_free(char *name, char *line);
+
+static const struct command {
+	const char name[8];
+	void (*handle)(char *name, char *line);
+} commands[] = {
+	{ "list",  cmd_list },
+	{ "l",     cmd_list },
+	{ "alloc", cmd_alloc },
+	{ "a",     cmd_alloc },
+	{ "afrom", cmd_alloc_from },
+	{ "A",     cmd_alloc_from },
+	{ "free",  cmd_free },
+	{ "f",     cmd_free },
+	{ "",      NULL }
+};
+
+
+#define SKIP_SPACE(ch) do { while (isspace(*(ch))) ++(ch); } while (0)
+
+
+static void handle_command(char *line)
+{
+	static char last_line[1024];
+
+	const struct command *cmd;
+	char *name;
+
+	SKIP_SPACE(line);
+	if (*line == '#')
+		return;
+
+	if (!*line)
+		strcpy(line, last_line);
+	else
+		strcpy(last_line, line);
+
+	name = line;
+	while (*line && !isspace(*line))
+		++line;
+
+	if (*line) {
+		*line = '\0';
+		++line;
+	}
+
+	for (cmd = commands; *(cmd->name); ++cmd)
+		if (!strcmp(name, cmd->name)) {
+			cmd->handle(name, line);
+			return;
+		}
+
+	fprintf(stderr, "%s: unknown command\n", name);
+}
+
+
+
+struct chunk {
+	struct chunk *next, *prev;
+	int fd;
+	unsigned long size;
+	unsigned long start;
+};
+
+static struct chunk root = {
+	.next = &root,
+	.prev = &root,
+};
+
+#define for_each(a) for (a = root.next; a != &root; a = a->next)
+
+static struct chunk *chunk_create(const char *prefix);
+static void chunk_destroy(struct chunk *chunk);
+static void chunk_add(struct chunk *chunk);
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret);
+
+
+static void cmd_list(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	(void)name; (void)line;
+
+	for_each(chunk)
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+}
+
+
+static void __cma_alloc(char *name, char *line, int from);
+
+static void cmd_alloc(char *name, char *line)
+{
+	__cma_alloc(name, line, 0);
+}
+
+static void cmd_alloc_from(char *name, char *line)
+{
+	__cma_alloc(name, line, 1);
+}
+
+static void __cma_alloc(char *name, char *line, int from)
+{
+	static const char *what[2] = { "dev/kind", "regions" };
+
+	unsigned long size, alignment = 0;
+	struct cma_alloc_request req;
+	struct chunk *chunk;
+	char *spec;
+	size_t n;
+	int ret;
+
+	SKIP_SPACE(line);
+	if (!*line) {
+		fprintf(stderr, "%s: expecting %s\n", name, what[from]);
+		return;
+	}
+
+	for (spec = line; *line && !isspace(*line); ++line)
+		/* nothing */;
+
+	if (!*line) {
+		fprintf(stderr, "%s: expecting size after %s\n",
+			name, what[from]);
+		return;
+	}
+
+	*line++ = '\0';
+	n = line - spec;
+	if (n > sizeof req.spec) {
+		fprintf(stderr, "%s: %s too long\n", name, what[from]);
+		return;
+	}
+
+	if (memparse(line, &line, &size) < 0 || !size) {
+		fprintf(stderr, "%s: invalid size\n", name);
+		return;
+	}
+
+	if (*line == '/')
+		if (memparse(line + 1, &line, &alignment) < 0) {
+			fprintf(stderr, "%s: invalid alignment\n", name);
+			return;
+		}
+
+	SKIP_SPACE(line);
+	if (*line) {
+		fprintf(stderr, "%s: unknown argument(s) at the end: %s\n",
+			name, line);
+		return;
+	}
+
+
+	chunk = chunk_create(name);
+	if (!chunk)
+		return;
+
+	fprintf(stderr, "%s: allocating %p/%p\n", name,
+		(void *)size, (void *)alignment);
+
+	req.magic     = CMA_MAGIC;
+	req.type      = from ? CMA_REQ_FROM_REG : CMA_REQ_DEV_KIND;
+	req.size      = size;
+	req.alignment = alignment;
+	req.start     = 0;
+
+	memcpy(req.spec, spec, n);
+	memset(req.spec + n, '\0', sizeof req.spec - n);
+
+	ret = ioctl(chunk->fd, IOCTL_CMA_ALLOC, &req);
+	if (ret < 0) {
+		fprintf(stderr, "%s: cma_alloc: %s\n", name, strerror(errno));
+		chunk_destroy(chunk);
+	} else {
+		chunk_add(chunk);
+		chunk->size  = req.size;
+		chunk->start = req.start;
+
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+	}
+}
+
+
+static void cmd_free(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	SKIP_SPACE(line);
+
+	if (*line) {
+		unsigned long num;
+
+		errno = 0;
+		num = strtoul(line, &line, 10);
+
+		if (errno || num > INT_MAX) {
+			fprintf(stderr, "%s: invalid number\n", name);
+			return;
+		}
+
+		SKIP_SPACE(line);
+		if (*line) {
+			fprintf(stderr,
+				"%s: unknown arguments at the end: %s\n",
+				name, line);
+			return;
+		}
+
+		for_each(chunk)
+			if (chunk->fd == (int)num)
+				goto ok;
+		fprintf(stderr, "%s: no chunk %3lu\n", name, num);
+		return;
+
+	} else {
+		chunk = root.prev;
+		if (chunk == &root) {
+			fprintf(stderr, "%s: no chunks\n", name);
+			return;
+		}
+	}
+
+ok:
+	fprintf(stderr, "%s: freeing %p@%p\n", name,
+		(void *)chunk->size, (void *)chunk->start);
+	chunk_destroy(chunk);
+}
+
+
+static struct chunk *chunk_create(const char *prefix)
+{
+	struct chunk *chunk;
+	int fd;
+
+	chunk = malloc(sizeof *chunk);
+	if (!chunk) {
+		fprintf(stderr, "%s: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	fd = open("/dev/cma", O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "%s: /dev/cma: %s\n", prefix, strerror(errno));
+		free(chunk);
+		return NULL;
+	}
+
+	chunk->prev = chunk;
+	chunk->next = chunk;
+	chunk->fd   = fd;
+	return chunk;
+}
+
+static void chunk_destroy(struct chunk *chunk)
+{
+	chunk->prev->next = chunk->next;
+	chunk->next->prev = chunk->prev;
+	close(chunk->fd);
+}
+
+static void chunk_add(struct chunk *chunk)
+{
+	chunk->next = &root;
+	chunk->prev = root.prev;
+	root.prev->next = chunk;
+	root.prev = chunk;
+}
+
+
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret)
+{
+	unsigned long val;
+
+	SKIP_SPACE(ptr);
+
+	errno = 0;
+	val = strtoul(ptr, &ptr, 0);
+	if (errno)
+		return -1;
+
+	switch (*ptr) {
+	case 'G':
+	case 'g':
+		val <<= 10;
+	case 'M':
+	case 'm':
+		val <<= 10;
+	case 'K':
+	case 'k':
+		val <<= 10;
+		++ptr;
+	}
+
+	if (retptr) {
+		SKIP_SPACE(ptr);
+		*retptr = ptr;
+	}
+
+	*ret = val;
+	return 0;
+}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 6/9] ARM: cma: Added CMA to Aquila, Goni and c210 universal boards
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This commit adds CMA memory reservation code to Aquila, Goni and c210
universal boards.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mach-s5pv210/mach-aquila.c         |   31 +++++++++++++++++++++++++++
 arch/arm/mach-s5pv210/mach-goni.c           |   31 +++++++++++++++++++++++++++
 arch/arm/mach-s5pv310/mach-universal_c210.c |   23 ++++++++++++++++++++
 3 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mach-s5pv210/mach-aquila.c b/arch/arm/mach-s5pv210/mach-aquila.c
index bf772de..ff89534 100644
--- a/arch/arm/mach-s5pv210/mach-aquila.c
+++ b/arch/arm/mach-s5pv210/mach-aquila.c
@@ -19,6 +19,7 @@
 #include <linux/gpio_keys.h>
 #include <linux/input.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -486,6 +487,35 @@ static struct platform_device *aquila_devices[] __initdata = {
 	&s5p_device_fimc2,
 };
 
+static void __init aquila_reserve(void)
+{
+	static struct cma_region regions[] = {
+		{
+			.name		= "fw",
+			.size		=   1 << 20,
+			{ .alignment	= 128 << 10 },
+			.start		= 0x32000000,
+		},
+		{
+			.name		= "b1",
+			.size		=  32 << 20,
+			.start		= 0x33000000,
+		},
+		{
+			.name		= "b2",
+			.size		=  16 << 20,
+			.start		= 0x44000000,
+		},
+		{ }
+	};
+
+	static const char map[] __initconst =
+		"s5p-mfc5/f=fw;s5p-mfc5/a=b1;s5p-mfc5/b=b2;*=b1,b2";
+
+	cma_set_defaults(regions, map);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init aquila_map_io(void)
 {
 	s5p_init_io(NULL, 0, S5P_VA_CHIPID);
@@ -523,4 +553,5 @@ MACHINE_START(AQUILA, "Aquila")
 	.map_io		= aquila_map_io,
 	.init_machine	= aquila_machine_init,
 	.timer		= &s3c24xx_timer,
+	.reserve	= aquila_reserve,
 MACHINE_END
diff --git a/arch/arm/mach-s5pv210/mach-goni.c b/arch/arm/mach-s5pv210/mach-goni.c
index fdc5cca..10dc2b9 100644
--- a/arch/arm/mach-s5pv210/mach-goni.c
+++ b/arch/arm/mach-s5pv210/mach-goni.c
@@ -19,6 +19,7 @@
 #include <linux/gpio_keys.h>
 #include <linux/input.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -467,6 +468,35 @@ static struct platform_device *goni_devices[] __initdata = {
 	&s3c_device_hsmmc2,
 };
 
+static void __init goni_reserve(void)
+{
+	static struct cma_region regions[] = {
+		{
+			.name		= "fw",
+			.size		=   1 << 20,
+			{ .alignment	= 128 << 10 },
+			.start		= 0x32000000,
+		},
+		{
+			.name		= "b1",
+			.size		=  32 << 20,
+			.start		= 0x33000000,
+		},
+		{
+			.name		= "b2",
+			.size		=  16 << 20,
+			.start		= 0x44000000,
+		},
+		{ }
+	};
+
+	static const char map[] __initconst =
+		"s5p-mfc5/f=fw;s5p-mfc5/a=b1;s5p-mfc5/b=b2;*=b1,b2";
+
+	cma_set_defaults(regions, map);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init goni_map_io(void)
 {
 	s5p_init_io(NULL, 0, S5P_VA_CHIPID);
@@ -498,4 +528,5 @@ MACHINE_START(GONI, "GONI")
 	.map_io		= goni_map_io,
 	.init_machine	= goni_machine_init,
 	.timer		= &s3c24xx_timer,
+	.reserve	= goni_reserve,
 MACHINE_END
diff --git a/arch/arm/mach-s5pv310/mach-universal_c210.c b/arch/arm/mach-s5pv310/mach-universal_c210.c
index b57efae..f735c52 100644
--- a/arch/arm/mach-s5pv310/mach-universal_c210.c
+++ b/arch/arm/mach-s5pv310/mach-universal_c210.c
@@ -12,6 +12,7 @@
 #include <linux/input.h>
 #include <linux/gpio_keys.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach-types.h>
@@ -126,6 +127,27 @@ static void __init universal_map_io(void)
 	s3c24xx_init_uarts(universal_uartcfgs, ARRAY_SIZE(universal_uartcfgs));
 }
 
+static void __init universal_reserve(void)
+{
+	static struct cma_region regions[] = {
+		{
+			.name		= "r",
+			.size		=  64 << 20,
+		},
+		{
+			.name		= "fw",
+			.size		=   1 << 20,
+		},
+		{ }
+	};
+
+	static const char map[] __initconst =
+		"s3c-mfc5/f=fw;*=r";
+
+	cma_set_defaults(regions, map);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init universal_machine_init(void)
 {
 #ifdef CONFIG_CACHE_L2X0
@@ -143,6 +165,7 @@ MACHINE_START(UNIVERSAL_C210, "UNIVERSAL_C210")
 	.boot_params	= S5P_PA_SDRAM + 0x100,
 	.init_irq	= s5pv310_init_irq,
 	.map_io		= universal_map_io,
+	.reserve	= universal_reserve,
 	.init_machine	= universal_machine_init,
 	.timer		= &s5pv310_timer,
 MACHINE_END
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 7/9] mm: vcm: Virtual Contiguous Memory framework added
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This commit adds the Virtual Contiguous Memory framework which
provides an abstraction for virtual address space provided by
various MMUs present on the platform.

The framework uses pluggable MMU drivers for hardware MMUs and,
if drivers obey some limitations, it can also be used on
platforms with no MMU.

For more information see
<Documentation/virtual-contiguous-memory.txt>.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/00-INDEX                      |    2 +
 Documentation/virtual-contiguous-memory.txt |  853 ++++++++++++++++++++++++
 include/linux/vcm-drv.h                     |  299 +++++++++
 include/linux/vcm.h                         |  275 ++++++++
 mm/Kconfig                                  |   30 +
 mm/Makefile                                 |    1 +
 mm/vcm.c                                    |  932 +++++++++++++++++++++++++++
 7 files changed, 2392 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual-contiguous-memory.txt
 create mode 100644 include/linux/vcm-drv.h
 create mode 100644 include/linux/vcm.h
 create mode 100644 mm/vcm.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index f93e787..7c79ffa 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -344,6 +344,8 @@ video-output.txt
 	- sysfs class driver interface to enable/disable a video output device.
 video4linux/
 	- directory with info regarding video/TV/radio cards and linux.
+virtual-contiguous-memory.txt
+	- documentation on virtual contiguous memory manager framework.
 vm/
 	- directory with info on the Linux vm code.
 volatile-considered-harmful.txt
diff --git a/Documentation/virtual-contiguous-memory.txt b/Documentation/virtual-contiguous-memory.txt
new file mode 100644
index 0000000..0c0e90c
--- /dev/null
+++ b/Documentation/virtual-contiguous-memory.txt
@@ -0,0 +1,853 @@
+                                                             -*- org -*-
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the implementation works, and how to implement MMU drivers
+that can be plugged into VCMM.  It also contains a rationale for VCMM.
+
+* The Virtual Contiguous Memory Manager
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses.  It also
+insulates the system from spurious or malicious device bus
+transactions and allows fine-grained mapping attribute control.  The
+Linux kernel core does not contain a generic API to handle IOMMU
+mapped memory; device driver writers must implement device specific
+code to interoperate with the Linux kernel core.  As the number of
+IOMMUs increases, coordinating the many address spaces mapped by all
+discrete IOMMUs becomes difficult without in-kernel support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device
+interoperation by treating devices with or without IOMMUs and all CPUs
+with or without MMUs, their mapping contexts and their mappings using
+common abstractions.  Physical hardware is given a generic device type
+and mapping contexts are abstracted into Virtual Contiguous Memory
+(VCM) regions.  Users "reserve" memory from VCMs and "bind" their
+reservations with physical memory.
+
+If drivers limit their use of VCM contexts to some subset of VCMM
+functionality, they can work without changes both with and without an
+MMU.
+
+** Why the VCMM is Needed
+
+Driver writers who control devices with IOMMUs must contend with
+device control and memory management.  Driver writers have a large
+device driver API that they can leverage to control their devices, but
+they are lacking a unified API to help them program mappings into
+IOMMUs and share those mappings with other devices and CPUs in the
+system.
+
+Sharing is complicated by Linux's CPU-centric VMM.  The CPU-centric
+model generally makes sense because average hardware only contains
+an MMU for the CPU and possibly a graphics MMU.  If every device in the
+system has one or more MMUs the CPU-centric memory management (MM)
+programming model breaks down.
+
+Abstracting IOMMU device programming into a common API has already
+begun in the Linux kernel.  It was built to abstract the difference
+between AMD and Intel IOMMUs to support x86 virtualization on both
+platforms.  The interface is listed in include/linux/iommu.h.  It
+contains interfaces for mapping and unmapping as well as domain
+management.  This interface has not gained widespread use outside of
+x86; the PA-RISC, Alpha and SPARC architectures and the ARM and PowerPC
+platforms all use their own mapping modules to control their IOMMUs.
+The VCMM contains an IOMMU programming layer, but since its
+abstraction supports map management independent of device control, the
+layer is not used directly.  This higher-level view enables a new
+kernel service, not just an IOMMU interoperation layer.
+
+** The General Idea: Map Management using Graphs
+
+Looking at mapping from a system-wide perspective reveals a general
+graph problem.  The VCMM's API is built to manage the general mapping
+graph.  Each node that talks to memory, either through an MMU or
+directly (physically mapped) can be thought of as the device-end of
+a mapping edge.  The other edge is the physical memory (or
+intermediate virtual space) that is mapped.  The figure below shows
+an example tree with a CPU and a few devices connected to the memory
+directly or through an MMU.
+
++--------------------------------------------------------------------+
+|                               Memory                               |
++--------------------------------------------------------------------+
+                                  |
+   +------------------+-----------+-------+----------+-----------+
+   |                  |                   |          |           |
++-----+            +-----+             +-----+  +--------+  +--------+
+| MMU |            | MMU |             | MMU |  | Device |  | Device |
++-----+            +-----+             +-----+  +--------+  +--------+
+   |                  |                   |
++-----+       +-------+---+-----....   +-----+
+| CPU |       |           |            | GPU |
++-----+  +--------+  +--------+        +-----+
+         | Device |  | Device |  ...
+         +--------+  +--------+
+
+For each MMU in the system a VCM context is created, through which
+drivers can make reservations and bind virtual addresses to physical
+space.  In the direct-mapped case the device is assigned a one-to-one
+MMU (as shown in the figure below).  This scheme allows direct-mapped
+devices to participate in general graph management.
+
++--------------------------------------------------------------------+
+|                               Memory                               |
++--------------------------------------------------------------------+
+                                  |
+   +------------------+-----------+-------+----------------+
+   |                  |                   |                |
++-----+            +-----+             +-----+      +------------+
+| MMU |            | MMU |             | MMU |      | One-to-One |
++-----+            +-----+             +-----+      +------------+
+   |                  |                   |                |
++-----+       +-------+---+-----....   +-----+       +-----+-----+
+| CPU |       |           |            | GPU |       |           |
++-----+  +--------+  +--------+        +-----+  +--------+  +--------+
+         | Device |  | Device |  ...            | Device |  | Device |
+         +--------+  +--------+                 +--------+  +--------+
+
+The CPU nodes can also be brought under the same mapping abstraction
+with the use of a light overlay on the existing VMM. This light
+overlay allows VCMM-managed mappings to interoperate with the common
+API.  The light overlay enables this without substantial modifications
+to the existing VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote
+CPU nodes that may be running other operating systems can be brought
+into the general abstraction.  Routing all memory management requests
+from a remote node through the central memory management framework
+enables new features like system-wide memory migration.  This feature
+may only be feasible for large buffers that are managed outside of the
+fast-path, but having remote allocation in a system enables features
+that are impossible to build without it.
+
+The fundamental objects that support graph-based map management are:
+Virtual Contiguous Memory contexts, reservations, and physical memory
+allocations.
+
+* Usage Overview
+
+In a nutshell, the platform initialises a VCM context for each MMU on
+the system, and possibly one-to-one VCM contexts, which are then
+passed to device drivers.  Later on, drivers make reservations of
+virtual address space from the VCM context.  At this point no physical
+memory has been committed to the reservation.  To bind physical memory
+to a reservation, physical memory is allocated (possibly as a set of
+discontiguous chunks) and then bound to the reservation.
+
+A single physical allocation can be bound to several different
+reservations, also from different VCM contexts.  This allows devices
+connected to the memory banks through different MMUs (or directly) to
+share physical memory buffers; it also makes it possible to map such
+memory into the CPU's address space (be it kernel or user space) so
+that the same data can be accessed by the CPU.
+
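+The following is a minimal sketch of this sequence using the calls
+described in the sections below (error handling is omitted and
+vcm_foo_create() stands for whatever driver-dependent creation call
+the platform uses):
+
+	struct vcm *vcm = vcm_foo_create(/* ... */);
+	struct vcm_res *res = vcm_reserve(vcm, 1 << 20, 0);
+	struct vcm_phys *phys = vcm_alloc(vcm, 1 << 20, 0);
+
+	vcm_bind(res, phys);
+	vcm_activate(vcm);
+
+	/* ... res->start is the address visible to the device ... */
+
+	vcm_deactivate(vcm);
+	vcm_unbind(res);
+	vcm_free(phys);
+	vcm_unreserve(res);
+	vcm_destroy(vcm);
+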
+[[file:../include/linux/vcm.h][include/linux/vcm.h]] includes comments documenting each API.
+
+** Virtual Contiguous Memory context
+
+A Virtual Contiguous Memory context (VCM) abstracts an address space
+a device sees.  A VCM is created with a VCM driver dependent call.  It
+is destroyed with a call to:
+
+        void vcm_destroy(struct vcm *vcm);
+
+The newly created VCM instance can be passed to any function that
+needs to operate on or with a virtual contiguous memory region.  All
+internals of the VCM driver and the way mappings are handled are
+hidden and VCM driver dependent.
+
+** Bindings
+
+If all that a driver needs is to allocate some physical space and map
+it into its address space, a vcm_make_binding() call can be used:
+
+	struct vcm_res	*__must_check
+	vcm_make_binding(struct vcm *vcm, resource_size_t size,
+			 unsigned alloc_flags, unsigned res_flags);
+
+This call allocates physical memory, reserves virtual address space
+and binds the two together.  If all of those succeed, a reservation
+is returned which has physical memory associated with it.
+
+If the driver does not require more complicated VCMM functionality, it
+is desirable to use this function since it will work both on real MMUs
+and with one-to-one mappings.
+
+To destroy a created binding, vcm_destroy_binding() can be used:
+
+        void vcm_destroy_binding(struct vcm_res *res);
+
+** Physical memory
+
+Physical memory allocations are handled using the following functions:
+
+	struct vcm_phys *__must_check
+	vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+	void vcm_free(struct vcm_phys *phys);
+
+It is noteworthy that physical space allocation is done in the context
+of a VCM.  This is especially important in the case of one-to-one VCM
+contexts which cannot handle discontiguous physical memory.
+
+Also, depending on the VCM context, the physical space may be
+allocated in parts of different sizes.  For instance, if a given MMU
+supports 16MiB, 1MiB, 64KiB and 4KiB pages, it is likely that
+vcm_alloc() in the context of this MMU's driver will try to split the
+allocation into as few parts of those sizes as possible.
+
+In the case of one-to-one VCM contexts, physical memory allocated with
+a call to vcm_alloc() may be usable only with the vcm_map() function.
+
+** Mappings
+
+The easiest way to map a physical space into virtual address space
+represented by VCM context is to use the vcm_map() function:
+
+	struct vcm_res *__must_check
+	vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags);
+
+This function reserves address space from the VCM context and binds
+the physical space to it.  To reverse the process vcm_unmap() can be used:
+
+	void vcm_unmap(struct vcm_res *res);
+
+Similarly to vcm_make_binding(), using vcm_map() may be advantageous
+over the use of vcm_reserve() followed by vcm_bind().  This is not
+only true for one-to-one mappings: if it so happens that the call to
+vcm_map() requests a mapping of physically contiguous space into
+kernel space, a direct mapping can be returned instead of creating
+a new one.
+
+In some cases, a reservation created with vcm_map() can be used only
+with the physical memory passed as the argument to vcm_map() (so if
+the user chooses to call vcm_unbind() and then vcm_bind() on
+a different physical memory, the call may fail).
+
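+As an illustration, the following sketch (error handling omitted) maps
+a freshly allocated buffer and later tears the mapping down:
+
+	struct vcm_phys *phys;
+	struct vcm_res *res;
+
+	phys = vcm_alloc(vcm, 1 << 20, 0);
+	res = vcm_map(vcm, phys, 0);
+
+	/* ... access the buffer through res->start ... */
+
+	vcm_unmap(res);
+	vcm_free(phys);
+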
+** Reservations
+
+A reservation is a contiguous region allocated from the virtual
+address space represented by a VCM context.  Just after a reservation
+is created, no physical memory is bound to it.  To manage reservations
+the following two functions are provided:
+
+	struct vcm_res *__must_check
+	vcm_reserve(struct vcm *vcm, resource_size_t size,
+		    unsigned flags);
+
+	void vcm_unreserve(struct vcm_res *res);
+
+The first one creates a reservation of desired size, and the second
+one destroys it.
+
+** Binding memory
+
+To bind physical memory to a reservation the vcm_bind() function is
+used:
+
+	int __must_check vcm_bind(struct vcm_res *res,
+				  struct vcm_phys *phys);
+
+When the binding is no longer needed, vcm_unbind() destroys the
+connection:
+
+	struct vcm_phys *vcm_unbind(struct vcm_res *res);
+
+** Activating mappings
+
+Unless a VCM context is activated, none of the bindings are actually
+guaranteed to be available.  When a device driver needs the mappings
+it needs to call the vcm_activate() function to guarantee that the
+mappings are sent to the hardware MMU.
+
+	int  __must_check vcm_activate(struct vcm *vcm);
+
+After a VCM context is activated all further bindings (made with
+vcm_make_binding(), vcm_map() or vcm_bind()) will be updated
+automatically, so there is no need to call vcm_activate() after each
+binding is done or undone.
+
+To deactivate the VCM context the vcm_deactivate() function is used:
+
+	void vcm_deactivate(struct vcm *vcm);
+
+Both of these functions can be called several times as long as each
+call to vcm_activate() is paired with a later call to
+vcm_deactivate().
+
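+For instance, the following sequence is valid; only the first
+vcm_activate() programs the hardware MMU and only the last
+vcm_deactivate() disables the mappings (return values are ignored for
+brevity):
+
+	vcm_activate(vcm);	/* mappings become active */
+	vcm_activate(vcm);	/* no-op, activation count is now two */
+	/* ... */
+	vcm_deactivate(vcm);	/* mappings are still active */
+	vcm_deactivate(vcm);	/* mappings are deactivated */
+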
+** Device driver example
+
+The following is a simple, untested example of how the platform and
+device drivers work together to use the VCM framework.  The platform
+initialises a context for each MMU in the system and passes it to the
+correct driver through platform device data.
+
+Device driver header file:
+
+	struct foo_platform_data {
+		/* ... */
+		struct vcm	*vcm;
+		/* ... */
+	};
+
+Platform code:
+
+	static int plat_bar_vcm_init(void)
+	{
+		struct foo_platform_data *fpdata;
+		struct vcm *vcm;
+
+		vcm = vcm_baz_create(...);
+		if (IS_ERR(vcm))
+			return PTR_ERR(vcm);
+
+		fpdata = dev_get_platdata(&foo_device.dev);
+		fpdata->vcm = vcm;
+
+		/* ... */
+
+		return 0;
+	}
+
+Device driver implementation:
+
+	struct foo_private {
+		/* ... */
+		struct vcm_res	*fw;
+		/* ... */
+	};
+
+	static inline struct vcm_res *__must_check
+	__foo_alloc(struct device *dev, size_t size)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		return vcm_make_binding(pdata->vcm, size, 0, 0);
+	}
+
+	static inline void __foo_free(struct vcm_res *res)
+	{
+		vcm_destroy_binding(res);
+	}
+
+	static int foo_probe(struct device *dev)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct foo_private *priv;
+
+		if (IS_ERR_OR_NULL(pdata->vcm))
+			return pdata->vcm ? PTR_ERR(pdata->vcm) : -EINVAL;
+
+		priv = kzalloc(sizeof *priv, GFP_KERNEL);
+		if (!priv)
+			return -ENOMEM;
+
+		/* ... */
+
+		priv->fw = __foo_alloc(dev, 1 << 20);
+		if (IS_ERR(priv->fw)) {
+			kfree(priv);
+			return PTR_ERR(priv->fw);
+		}
+		/* copy firmware to fw */
+
+		vcm_activate(pdata->vcm);
+
+		dev_set_drvdata(dev, priv);
+
+		return 0;
+	}
+
+	static int foo_remove(struct device *dev)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct foo_private *priv = dev_get_drvdata(dev);
+
+		/* ... */
+
+		vcm_deactivate(pdata->vcm);
+		__foo_free(priv->fw);
+
+		kfree(priv);
+
+		return 0;
+	}
+
+	static int foo_do_something(struct device *dev, /* ... */)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct vcm_res *buf;
+		int ret;
+
+		buf = __foo_alloc(dev, /* ... size ... */);
+		if (IS_ERR(buf))
+			return PTR_ERR(buf);
+
+		/*
+		 * buf->start is address visible from device's
+		 * perspective.
+		 */
+
+		/* ... set hardware up ... */
+
+		/* ... wait for completion ... */
+
+		__foo_free(buf);
+
+		return ret;
+	}
+
+In the above example only the vcm_make_binding() function is used, so
+the scheme will work not only on systems with an MMU but also in the
+case of a one-to-one VCM context.
+
+** IOMMU, one-to-one and VMM contexts
+
+The following example demonstrates mapping IOMMU, one-to-one and VMM
+reservations to the same physical memory.  For readability, error
+handling is not shown on the listings.
+
+First, each context needs to be created.  The call used for creating
+a context depends on the driver used.  The following is just an
+example of how this could look:
+
+	struct vcm *vcm_vmm, *vcm_onetoone, *vcm_iommu;
+
+	vcm_vmm      = vcm_vmm_create();
+	vcm_onetoone = vcm_onetoone_create();
+	vcm_iommu    = vcm_foo_mmu_create();
+
+Once contexts are created, physical space needs to be allocated,
+reservations made on each context and physical memory mapped to those
+reservations.  Because there is a one-to-one context, the memory has
+to be allocated from its context.  It's also best to map the memory in
+the single call using vcm_make_binding():
+
+	struct vcm_res *res_onetoone;
+
+	res_onetoone = vcm_make_binding(vcm_onetoone, SZ_2M + SZ_4K, 0, 0);
+
+What's left is to map the space in the other two contexts.  If the
+reservations in the other two contexts won't be used for any purpose
+other than referencing the memory allocated above, it's best to use
+vcm_map():
+
+	struct vcm_res *res_vmm, *res_iommu;
+
+	res_vmm = vcm_map(vcm_vmm, res_onetoone->phys, 0);
+	res_iommu = vcm_map(vcm_iommu, res_onetoone->phys, 0);
+
+Once the bindings have been created, the contexts need to be activated
+to make sure that they are actually on the hardware. (In case of
+one-to-one mapping it's most likely a no-operation but it's still
+required by the VCMM API so it must not be omitted.)
+
+	vcm_activate(vcm_vmm);
+	vcm_activate(vcm_onetoone);
+	vcm_activate(vcm_iommu);
+
+At this point, all three reservations represent addresses in their
+respective address spaces that are bound to the same physical memory.
+Not only can the CPU access it now, but so can devices connected
+through the MMU as well as devices connected directly to the memory
+banks.  The bus address for the devices and the virtual address for
+the CPU are available through the 'start' member of the vcm_res
+structure (ie. the res_* objects above).
+
+Once the mappings are no longer used and the memory is no longer
+needed, it can be freed as follows:
+
+	vcm_unmap(res_vmm);
+	vcm_unmap(res_iommu);
+	vcm_destroy_binding(res_onetoone);
+
+If the contexts are not needed either, they can be disabled:
+
+	vcm_deactivate(vcm_vmm);
+	vcm_deactivate(vcm_iommu);
+	vcm_deactivate(vcm_onetoone);
+
+and then even destroyed:
+
+	vcm_destroy(vcm_vmm);
+	vcm_destroy(vcm_iommu);
+	vcm_destroy(vcm_onetoone);
+
+* Available drivers
+
+The following VCM drivers are provided:
+
+** Real hardware drivers
+
+There are no real hardware drivers at this time.
+
+** One-to-One drivers
+
+As has been noted, One-to-One drivers are limited in the sense that
+certain operations are very unlikely to succeed.  In fact, it is often
+certain that some operations will fail.  If your driver needs to be
+able to run with a One-to-One driver you should limit operations to:
+
+	vcm_make_binding()
+	vcm_destroy_binding()
+
+Under some conditions, vcm_map() may also work.
+
+There are no One-to-One drivers at this time.
+
+* Writing a VCM driver
+
+The core of VCMM does not handle communication with the MMU.  For this
+purpose a VCM driver is used.  Its purpose is to manage virtual
+address space reservations and physical allocations, as well as to
+update mappings in the hardware MMU.
+
+API designed for VCM drivers is described in the
+[[file:../include/linux/vcm-drv.h][include/linux/vcm-drv.h]] file so it might be a good idea to take a look
+inside.
+
+VCMM provides an API for three different kinds of drivers.  The most
+basic is a core VCM driver which VCMM uses directly.  Other than that,
+VCMM provides two wrappers -- VCM MMU and VCM One-to-One -- which can
+be used to create drivers for real hardware VCM contexts and for
+One-to-One contexts.
+
+All of the drivers need to provide a context creation function which
+allocates memory, fills in the start address, size and pointer to
+driver operations, and then calls an init function which fills in the
+rest of the fields and validates the entered values.
+
+** Writing a core VCM driver
+
+The core driver needs to provide a context creation function as well
+as at least some of the following operations:
+
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*alloc)(struct vcm *vcm, resource_size_t size,
+		     struct vcm_phys **phys, unsigned alloc_flags,
+		     struct vcm_res **res, unsigned res_flags);
+	struct vcm_res *(*res)(struct vcm *vcm, resource_size_t size,
+			       unsigned flags);
+	struct vcm_phys *(*phys)(struct vcm *vcm, resource_size_t size,
+				 unsigned flags);
+
+	void (*unreserve)(struct vcm_res *res);
+
+	struct vcm_res *(*map)(struct vcm *vcm, struct vcm_phys *phys,
+			       unsigned flags);
+	int (*bind)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*unbind)(struct vcm_res *res);
+
+	int (*activate)(struct vcm *vcm);
+	void (*deactivate)(struct vcm *vcm);
+
+All of the operations (except for alloc) may assume that all pointer
+arguments are non-NULL.  (In the case of alloc, if any argument is
+NULL it is either phys or res, never both.)
+
+*** Context creation
+
+To use a VCM driver a VCM context bound to that driver has to be
+provided.  This is done by a driver-dependent call defined in its
+header file.  Such a call may take various arguments to configure the
+context of the MMU.  Its prototype may look as follows:
+
+	struct vcm *__must_check vcm_samp_create(/* ... */);
+
+The driver will most likely define a structure encapsulating the vcm
+structure (in the usual way).  The context creation function must
+allocate space for such a structure and initialise it correctly,
+including all members of the vcm structure except for activations.
+The activations member is initialised by calling:
+
+	struct vcm *__must_check vcm_init(struct vcm *vcm);
+
+This function also validates that all fields are set correctly.
+
+The driver field of the vcm structure must point to a structure with
+all operations supported by the driver.
+
+If everything succeeds, the function has to return a pointer to the
+vcm structure inside the encapsulating structure.  It is this pointer
+that will be passed to all of the driver's operations.  On error,
+an error-pointer must be returned (ie. not NULL).
+
+The function might look something like the following:
+
+	struct vcm *__must_check vcm_foo_create(/* ... */)
+	{
+		struct vcm_foo *foo;
+		struct vcm *vcm;
+
+		foo = kzalloc(sizeof *foo, GFP_KERNEL);
+		if (!foo)
+			return ERR_PTR(-ENOMEM);
+
+		/* ... do stuff ... */
+
+		foo->vcm.start  = /* ... */;
+		foo->vcm.size   = /* ... */;
+		foo->vcm.driver = &vcm_foo_driver;
+
+		vcm = vcm_init(&foo->vcm);
+		if (IS_ERR(vcm)) {
+			/* ... error recovery ... */
+			kfree(foo);
+		}
+		return vcm;
+	}
+
+*** Cleaning up
+
+The cleanup operation is called when the VCM context is destroyed.
+Its purpose is to free all resources acquired when the VCM context was
+created, including the space for the context structure.  If it is not
+given, the memory is freed using the kfree() function.
+
+*** Allocation and reservations
+
+If the alloc operation is specified, the res and phys operations are
+ignored.  The observable behaviour of the alloc operation should mimic
+as closely as possible the res and phys operations called one after
+the other.
+
+The reason for this operation is that in the case of one-to-one VCM
+contexts, the driver may not be able to bind an arbitrary reservation
+to an arbitrary physical space.  In one-to-one contexts, reservations
+and physical memory are tied together and need to be made at the same
+time for binding to be possible.
+
+The alloc operation may be called with both res and phys set, or with
+one of them (but never both) being NULL.
+
+The res operation reserves virtual address space in the VCM context.
+The function must set the start and res_size members of the vcm_res
+structure -- all other fields are filled by the VCMM framework.
+
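+For illustration only, a res operation satisfying the above could look
+roughly as follows (foo_find_hole() is a made-up stand-in for whatever
+address space allocator the driver uses):
+
+	static struct vcm_res *
+	foo_res(struct vcm *vcm, resource_size_t size, unsigned flags)
+	{
+		struct vcm_res *res = kzalloc(sizeof *res, GFP_KERNEL);
+		dma_addr_t addr;
+
+		if (!res)
+			return ERR_PTR(-ENOMEM);
+
+		addr = foo_find_hole(vcm, size);	/* hypothetical */
+		if (!addr) {
+			kfree(res);
+			return ERR_PTR(-ENOSPC);
+		}
+
+		res->start    = addr;
+		res->res_size = size;
+		return res;
+	}
+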
+The phys operation allocates physical space which can later be bound
+to the reservation.  Unless the VCM driver needs some special handling
+of physical memory, the vcm_phys_alloc() function can be used:
+
+	struct vcm_phys *__must_check
+	vcm_phys_alloc(resource_size_t size, unsigned flags,
+		       const unsigned char *orders);
+
+The last argument of this function (orders) is an array of orders of
+page sizes that function should try to allocate.  This array must be
+sorted from highest order to lowest and the last entry must be zero.
+
+For instance, an array { 8, 4, 0 } means that the function should try
+to allocate 1MiB, 64KiB and 4KiB pages (assuming PAGE_SIZE is 4KiB,
+which is true for all supported architectures).  For example, if the
+requested size is 2MiB and 68KiB, the function will try to allocate
+two 1MiB pages, one 64KiB page and one 4KiB page.  This may be useful
+when the mapping is written to the MMU since the largest possible
+pages will be used, reducing the number of entries.
+
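+As an illustration, a phys operation for such an MMU could simply
+delegate to vcm_phys_alloc() (this is only a sketch and the foo_*
+names are made up):
+
+	/* 1MiB, 64KiB and 4KiB pages, assuming a 4KiB PAGE_SIZE */
+	static const unsigned char foo_orders[] = { 8, 4, 0 };
+
+	static struct vcm_phys *
+	foo_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+	{
+		return vcm_phys_alloc(size, flags, foo_orders);
+	}
+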
+If the phys or alloc callback chooses to allocate physical memory on
+its own, it must provide a free callback along with the vcm_phys
+structure.  The purpose of that callback is, as one may imagine, to
+free the allocated space.
+
+All those operations may assume that size is non-zero and divisible by
+PAGE_SIZE.
+
+*** Binding
+
+The map operation is optional and joins the res and bind operations
+together.  Like the alloc operation, this is provided because in the
+case of one-to-one mappings the VCM driver may be unable to bind
+physical space to an arbitrary reservation.
+
+Moreover, in the case of some VCM drivers, a mapping for a given
+physical memory may already be present (ie. when the VMM is used).
+
+A reservation created with the map operation does not have to be
+usable with any physical space other than the one provided when the
+reservation was created.
+
+The bind operation binds a given reservation to a given physical
+memory.  The operation may assume that the reservation given as an
+argument is not bound to any physical memory.
+
+Whichever of the two operations is used, the binding must be reflected
+on the hardware if the VCM context has been activated.  If the VCM
+context has not been activated this is not required.
+
+The vcm_map() function uses the map operation if one is provided.
+Otherwise, it falls back to the alloc or res operation followed by the
+bind operation.  If this is also not possible, -EOPNOTSUPP is
+returned.  Similarly, the vcm_bind() function uses the bind operation;
+if it is not provided, -EOPNOTSUPP is returned.
+
+Also, if the alloc operation is not provided but map is, the
+vcm_make_binding() function will use the phys and map operations.
+
+*** Freeing resources
+
+The unbind callback removes the binding between a reservation and
+a physical memory.  If the unbind operation is not provided, VCMM
+assumes that it is a no-operation.
+
+The unreserve callback releases a reservation as well as frees the
+space allocated for the vcm_res structure.  It is required, and if it
+is not provided vcm_unreserve() will generate a warning.
+
+*** Activation
+
+When a VCM context is activated, the activate callback is called.  It
+is called only once even if vcm_activate() is called several times on
+the same context.
+
+When a VCM context is deactivated (that is, when for each call to
+vcm_activate() a matching vcm_deactivate() has been called) the
+deactivate callback is called.
+
+When a VCM context is activated, all bound reservations must be
+reflected on the hardware MMU (if any).  Also, after activation, all
+calls to vcm_bind(), vcm_map() or vcm_make_binding() must
+automatically reflect new mappings on the hardware MMU.
+
+Neither of the operations is required and if missing, VCMM will assume
+they are a no-operation and no warning will be generated.
+
+** Writing a hardware MMU driver
+
+It may be undesirable to implement all of the operations that are
+required to create a usable driver.  In case of hardware MMUs a helper
+wrapper driver has been created to make writing real drivers as simple
+as possible.
+
+The wrapper implements most of the functionality of the driver,
+leaving only the actual talking to the hardware MMU in the hands of
+the programmer.  Reservation management as well as general
+housekeeping is already there.
+
+If you want to use this wrapper, you need to select VCM_MMU Kconfig
+option.
+
+*** Context creation
+
+Similarly to normal drivers, an MMU driver needs to provide a context
+creation function.  Such a function must allocate a vcm_mmu object and
+initialise the vcm.start, vcm.size and driver fields of the structure.
+When this is done, vcm_mmu_init() should be called, which will
+initialise the rest of the fields and validate the entered values:
+
+	struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu);
+
+This is, in fact, very similar to the way a standard driver is created.
+
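+For illustration, a minimal creation function could look as follows
+(vcm_foo_mmu_driver is assumed to be a structure with the operations
+described below):
+
+	struct vcm *__must_check vcm_foo_mmu_create(/* ... */)
+	{
+		struct vcm_mmu *mmu;
+		struct vcm *vcm;
+
+		mmu = kzalloc(sizeof *mmu, GFP_KERNEL);
+		if (!mmu)
+			return ERR_PTR(-ENOMEM);
+
+		mmu->vcm.start = /* ... */;
+		mmu->vcm.size  = /* ... */;
+		mmu->driver    = &vcm_foo_mmu_driver;
+
+		vcm = vcm_mmu_init(mmu);
+		if (IS_ERR(vcm))
+			kfree(mmu);
+		return vcm;
+	}
+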
+*** Orders
+
+One of the fields of the vcm_mmu_driver structure is orders.  This is
+an array of orders of pages supported by the hardware MMU.  It must be
+sorted from largest to smallest and zero terminated.
+
+The order is the base-two logarithm of a supported page size divided
+by PAGE_SIZE.  For instance, { 8, 4, 0 } means that the MMU supports
+1MiB, 64KiB and 4KiB pages (assuming a 4KiB PAGE_SIZE).
+
+*** Operations
+
+The three operations that MMU wrapper driver uses are:
+
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*activate)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*deactivate)(struct vcm_res *res, struct vcm_phys *phys);
+
+	int (*activate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			     unsigned order, void *vcm),
+	int (*deactivate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			       unsigned order, void *vcm),
+
+The first one frees all resources allocated by the context creation
+function (including the structure itself).  If this operation is not
+given, kfree() will be called on the vcm_mmu structure.
+
+The activate and deactivate operations are required and they are used
+to update mappings in the MMU.  Whenever a binding is activated or
+deactivated the respective operation is called.
+
+To divide mapping into physical pages, vcm_phys_walk() function can be
+used:
+
+	int vcm_phys_walk(dma_addr_t vaddr, const struct vcm_phys *phys,
+			  const unsigned char *orders,
+			  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+					  unsigned order, void *priv),
+			  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+					  unsigned order, void *priv),
+			  void *priv);
+
+It starts from the given virtual address and tries to divide the
+allocated physical memory into as few pages as possible where the
+order of each page is one of the orders specified by the orders
+argument.
+
+It may be easier to implement the activate_page and deactivate_page
+operations instead, though.  They are called on each individual page
+rather than on the whole mapping.  The wrapper then calls
+vcm_phys_walk() itself so the driver does not need to call it
+explicitly.
+
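+For illustration, a driver for a hypothetical MMU manipulated through
+made-up foo_mmu_set_entry() and foo_mmu_clear_entry() helpers could
+provide the following (a sketch only):
+
+	static int foo_activate_page(dma_addr_t vaddr, dma_addr_t paddr,
+				     unsigned order, void *vcm)
+	{
+		/* map a single (PAGE_SIZE << order) page */
+		return foo_mmu_set_entry(vcm, vaddr, paddr, order);
+	}
+
+	static int foo_deactivate_page(dma_addr_t vaddr, dma_addr_t paddr,
+				       unsigned order, void *vcm)
+	{
+		return foo_mmu_clear_entry(vcm, vaddr, order);
+	}
+
+	static const struct vcm_mmu_driver vcm_foo_mmu_driver = {
+		.orders		 = foo_orders,	/* { 8, 4, 0 } as above */
+		.activate_page	 = foo_activate_page,
+		.deactivate_page = foo_deactivate_page,
+	};
+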
+** Writing a one-to-one VCM driver
+
+Similarly to the wrapper for a real hardware MMU, a wrapper for
+one-to-one VCM contexts has been created.  It implements all of the
+housekeeping operations and leaves only contiguous memory management
+(that is, allocating and freeing contiguous regions) to the driver.
+
+*** Context creation
+
+As with other drivers, a one-to-one driver needs to provide a context
+creation function.  It needs to allocate space for a vcm_o2o structure
+and initialise its vcm.start, vcm.size and driver fields.  Calling
+vcm_o2o_init() will fill in the other fields and validate the entered
+values:
+
+	struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o);
+
+There are the following three operations used by the wrapper:
+
+	void (*cleanup)(struct vcm *vcm);
+	void *(*alloc)(struct vcm *vcm, struct vcm_phys_part *part,
+		       unsigned flags);
+	void (*free)(struct vcm_phys_part *part, void *priv);
+
+The cleanup operation cleans up the context and frees all resources.
+If not provided, kfree() is used.
+
+The alloc operation allocates physically contiguous memory.  The size
+of the requested block is passed in the provided vcm_phys_part
+structure.  Inside this structure, the operation must save the
+physical address of the allocated block if the allocation succeeds.
+On error, the operation must return an error-pointer (NULL is treated
+as success).
+
+The free operation frees the physically contiguous memory that has
+been allocated with alloc.  As arguments, it is given the same
+vcm_phys_part structure that alloc initialised as well as a priv
+argument which is the value returned by the alloc operation.
+
+Both the alloc and free operations are required.
+
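+For illustration, a minimal pair of operations based on the page
+allocator could look as follows (a sketch only; a real driver would
+more likely use a dedicated contiguous allocator such as CMA):
+
+	static void *foo_o2o_alloc(struct vcm *vcm,
+				   struct vcm_phys_part *part,
+				   unsigned flags)
+	{
+		struct page *page =
+			alloc_pages(GFP_KERNEL, get_order(part->size));
+		if (!page)
+			return ERR_PTR(-ENOMEM);
+		part->start = page_to_phys(page);
+		return page;
+	}
+
+	static void foo_o2o_free(struct vcm_phys_part *part, void *priv)
+	{
+		__free_pages(priv, get_order(part->size));
+	}
+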
+If you want to use this wrapper, you need to select VCM_O2O Kconfig
+option.
+
+* Epilogue
+
+The initial version of the VCMM framework was written by Zach Pfeffer
+<zpfeffer@codeaurora.org>.  It was then redesigned and mostly
+rewritten by Michal Nazarewicz <m.nazarewicz@samsung.com>.
+
+The new version is still lacking a few important features.  Most
+notably, no real hardware MMU has been implemented yet.  This may be
+ported from Zach's original proposal.
+
+Also, support for VMM is lacking.  This is another thing that can be
+ported from Zach's proposal.
diff --git a/include/linux/vcm-drv.h b/include/linux/vcm-drv.h
new file mode 100644
index 0000000..8c1862e
--- /dev/null
+++ b/include/linux/vcm-drv.h
@@ -0,0 +1,299 @@
+/*
+ * Virtual Contiguous Memory driver API header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_DRV_H
+#define __LINUX_VCM_DRV_H
+
+#include <linux/vcm.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+
+#include <asm/atomic.h>
+
+/**
+ * struct vcm_driver - the MMU driver operations.
+ * @cleanup:	called when the vcm object is destroyed; if omitted,
+ *		kfree() will be used.
+ * @alloc:	callback function for allocating physical memory and
+ *		reserving virtual address space; XXX FIXME: document;
+ *		if set, @res and @phys are ignored.
+ * @res:	creates a reservation of virtual address space; XXX FIXME:
+ *		document; if @alloc is provided this is ignored.
+ * @phys:	allocates physical memory; XXX FIXME: document; if @alloc
+ *		is provided this is ignored.
+ * @unreserve:	destroys a virtual address space reservation created by
+ *		@res (or @alloc); required.
+ * @map:	reserves address space and binds a physical memory to it.
+ * @bind:	binds a physical memory to a reserved address space.
+ * @unbind:	unbinds a physical memory from reserved address space.
+ * @activate:	activates the context making all bindings active; once
+ *		the context has been activated, this callback is not
+ *		called again until context is deactivated and
+ *		activated again (so if user calls vcm_activate()
+ *		several times only the first call in sequence will
+ *		invoke this callback).
+ * @deactivate:	deactivates the context making all bindings inactive;
 *		a call to this callback always accompanies a call to the
 *		@activate callback.
+ */
+struct vcm_driver {
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*alloc)(struct vcm *vcm, resource_size_t size,
+		     struct vcm_phys **phys, unsigned alloc_flags,
+		     struct vcm_res **res, unsigned res_flags);
+	struct vcm_res *(*res)(struct vcm *vcm, resource_size_t size,
+			       unsigned flags);
+	struct vcm_phys *(*phys)(struct vcm *vcm, resource_size_t size,
+				 unsigned flags);
+
+	void (*unreserve)(struct vcm_res *res);
+
+	struct vcm_res *(*map)(struct vcm *vcm, struct vcm_phys *phys,
+			       unsigned flags);
+	int (*bind)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*unbind)(struct vcm_res *res);
+
+	int (*activate)(struct vcm *vcm);
+	void (*deactivate)(struct vcm *vcm);
+};
+
+/**
+ * struct vcm_phys - representation of allocated physical memory.
+ * @count:	number of contiguous parts the memory consists of; if this
+ *		equals one the whole memory block is physically contiguous;
+ *		read only.
+ * @size:	total size of the allocated memory; read only.
+ * @free:	callback function called when memory is freed; internal.
+ * @bindings:	how many virtual address space reservations this memory has
+ *		been bound to; internal.
+ * @parts:	array of @count parts describing each physically contiguous
+ *		memory block that the whole area consists of; each element
+ *		describes part's physical starting address in bytes
+ *		(@parts->start) and its size in bytes (@parts->size); read
+ *		only.
+ */
+struct vcm_phys {
+	unsigned		count;
+	resource_size_t		size;
+
+	void (*free)(struct vcm_phys *phys);
+	atomic_t		bindings;
+
+	struct vcm_phys_part {
+		dma_addr_t	start;
+		resource_size_t	size;
+	} parts[0];
+};
+
+/**
+ * vcm_init() - initialises VCM context structure.
+ * @vcm:	the VCM context to initialise.
+ *
+ * This function initialises the vcm structure created by an MMU driver
+ * when setting things up.  It sets up all fields of the vcm structure
+ * except for @vcm->start, @vcm->size and @vcm->driver which are
+ * validated by this function.  If they have an invalid value the
+ * function produces a warning and returns an error-pointer.  If
+ * everything is fine, @vcm is returned.
+ */
+struct vcm *__must_check vcm_init(struct vcm *vcm);
+
+#ifdef CONFIG_VCM_MMU
+
+struct vcm_mmu;
+
+/**
+ * struct vcm_mmu_driver - a driver used for real MMUs.
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ * @cleanup:	Function called when the VCM context is destroyed;
+ *		optional, if not provided, kfree() is used.
+ * @activate:	callback function for activating a single mapping; its
+ *		role is to set up the MMU so that the reserved address
+ *		space denoted by res will point to the physical memory
+ *		denoted by phys; required unless @activate_page and
+ *		@deactivate_page are both provided.
+ * @deactivate:	this reverses the effect of @activate; required unless
+ *		@deactivate_page is provided.
+ * @activate_page:	callback function for activating a single page; it is
+ *			ignored if @activate is provided; it's given a single
+ *			page such that its order (given as third argument) is
+ *			one of the supported orders specified in @orders;
+ *			required unless @activate is provided.
+ * @deactivate_page:	this reverses the effect of the @activate_page
+ *			callback; required unless @activate and @deactivate
+ *			are both provided.
+ */
+struct vcm_mmu_driver {
+	const unsigned char	*orders;
+
+	void (*cleanup)(struct vcm *vcm);
+	int (*activate)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*deactivate)(struct vcm_res *res, struct vcm_phys *phys);
+	int (*activate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			     unsigned order, void *vcm);
+	int (*deactivate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			       unsigned order, void *vcm);
+};
+
+/**
+ * struct vcm_mmu - VCM MMU context
+ * @vcm:	VCM context.
+ * @driver:	VCM MMU driver's operations.
+ * @pool:	virtual address space allocator; internal.
+ * @bound_res:	list of bound reservations; internal.
+ * @mutex:	mutex protecting @bound_res; internal.
+ * @activated:	whether VCM context has been activated; internal.
+ */
+struct vcm_mmu {
+	struct vcm			vcm;
+	const struct vcm_mmu_driver	*driver;
+	/* internal */
+	struct gen_pool			*pool;
+	struct list_head		bound_res;
+	/* The mutex protects operations on the bound_res list. */
+	struct mutex			mutex;
+	int				activated;
+};
+
+/**
+ * vcm_mmu_init() - initialises a VCM context for a real MMU.
+ * @mmu:	the vcm_mmu context to initialise.
+ *
+ * This function initialises the vcm_mmu structure created by an MMU
+ * driver when setting things up.  It sets up all fields of the
+ * structure except for @mmu->vcm.start, @mmu->vcm.size and
+ * @mmu->driver which are validated by this function.  If they have an
+ * invalid value the function produces a warning and returns an
+ * error-pointer.  On any other error, an error-pointer is returned as
+ * well.  If everything is fine, the address of @mmu->vcm is returned.
+ */
+struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu);
+
+#endif
+
+#ifdef CONFIG_VCM_O2O
+
+/**
+ * struct vcm_o2o_driver - VCM One-to-One driver
+ * @cleanup:	cleans up the VCM context; if not specified, kfree() is used.
+ * @alloc:	physically contiguous memory allocator; the size of the
+ *		block to allocate is specified by part->size; the physical
+ *		address of the block must be returned in part->start;
+ *		on error must return an error-pointer, otherwise some
+ *		other pointer which will be passed to @free as priv;
+ *		required.
+ * @free:	physical memory freeing function; required.
+ */
+struct vcm_o2o_driver {
+	void (*cleanup)(struct vcm *vcm);
+	void *(*alloc)(struct vcm *vcm, struct vcm_phys_part *part,
+		       unsigned flags);
+	void (*free)(struct vcm_phys_part *part, void *priv);
+};
+
+/**
+ * struct vcm_o2o - VCM One-to-One context
+ * @vcm:	VCM context.
+ * @driver:	VCM One-to-One driver's operations.
+ */
+struct vcm_o2o {
+	struct vcm			vcm;
+	const struct vcm_o2o_driver	*driver;
+};
+
+/**
+ * vcm_o2o_init() - initialises a VCM context for a one-to-one context.
+ * @o2o:	the vcm_o2o context to initialise.
+ *
+ * This function initialises the vcm_o2o structure created by an O2O
+ * driver when setting things up.  It sets up all fields of the
+ * structure except for @o2o->vcm.start, @o2o->vcm.size and
+ * @o2o->driver which are validated by this function.  If they have an
+ * invalid value the function produces a warning and returns an
+ * error-pointer.  On any other error, an error-pointer is returned as
+ * well.  If everything is fine, the address of @o2o->vcm is returned.
+ */
+struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o);
+
+#endif
+
+#ifdef CONFIG_VCM_PHYS
+
+/**
+ * vcm_phys_alloc() - allocates physical discontiguous space
+ * @size:	size of the block to allocate.
+ * @flags:	additional allocation flags; XXX FIXME: document
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ *
+ * This function tries to allocate a possibly discontiguous physical
+ * space in such a way that it allocates the largest possible blocks
+ * from the sizes denoted by the @orders array.  So if @orders is
+ * { 8, 0 } (which means 1MiB and 4KiB pages are to be used) and the
+ * requested @size is 2MiB and 12KiB, the function will try to
+ * allocate two 1MiB pages and three 4KiB pages (in that order).  If
+ * a big page cannot be allocated the function will still try to
+ * allocate smaller pages.
+ */
+struct vcm_phys *__must_check
+vcm_phys_alloc(resource_size_t size, unsigned flags,
+	       const unsigned char *orders);
+
+/**
+ * vcm_phys_walk() - helper function for mapping physical pages
+ * @vaddr:	virtual address to map/unmap physical space to/from
+ * @phys:	physical space
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ * @callback:	function called for each page.
+ * @recovery:	function called for each page when @callback returns
+ *		a negative number; if it also returns a negative number
+ *		the walk terminates; may be NULL.
+ * @priv:	private data for the callbacks.
+ *
+ * This function walks through @phys trying to match the largest
+ * possible page size denoted by @orders.  For each such page @callback
+ * is called.  If @callback returns a negative number the function
+ * calls @recovery for each page for which @callback was called
+ * successfully.
+ *
+ * So, for instance, if we have a physical memory which consists of
+ * a 1MiB part and an 8KiB part and @orders is { 8, 0 } (which means
+ * 1MiB and 4KiB pages are to be used), @callback will be called first
+ * with a 1MiB page and then two times with a 4KiB page.  This is of
+ * course provided that @vaddr has the correct alignment.
+ *
+ * The idea is for hardware MMU drivers to call this function and
+ * provide callbacks for mapping/unmapping a single page.  The
+ * function divides the region into pages that the MMU can handle.
+ *
+ * If @callback at one point returns a negative number this is the
+ * return value of the function; otherwise zero is returned.
+ */
+int vcm_phys_walk(dma_addr_t vaddr, const struct vcm_phys *phys,
+		  const unsigned char *orders,
+		  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv),
+		  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv),
+		  void *priv);
+
+#endif
+
+#endif
diff --git a/include/linux/vcm.h b/include/linux/vcm.h
new file mode 100644
index 0000000..965dc9b
--- /dev/null
+++ b/include/linux/vcm.h
@@ -0,0 +1,275 @@
+/*
+ * Virtual Contiguous Memory header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_H
+#define __LINUX_VCM_H
+
+#include <linux/kref.h>
+#include <linux/compiler.h>
+
+struct vcm_driver;
+struct vcm_phys;
+
+/**
+ * struct vcm - A virtually contiguous memory context.
+ * @start:	the smallest possible address available in this context.
+ * @size:	size of available address space in bytes; internal, read
+ *		only for MMU drivers.
+ * @activations:	How many times context was activated; internal,
+ *			read only for MMU drivers.
+ * @driver:	driver handling this context; internal.
+ *
+ * This structure represents a context of virtually contiguous memory
+ * managed by the MMU driver pointed to by @driver.  This is the main
+ * structure used to interact with the VCM framework.
+ *
+ * Whenever a driver wants to reserve virtual address space or allocate
+ * backing storage a pointer to this structure must be passed.
+ *
+ */
+struct vcm {
+	dma_addr_t		start;
+	resource_size_t		size;
+	atomic_t		activations;
+	const struct vcm_driver	*driver;
+};
+
+/**
+ * struct vcm_res - A reserved virtually contiguous address space.
+ * @start:	bus address of the region in bytes; read only.
+ * @bound_size:	number of bytes actually bound to the virtual address;
+ *		read only.
+ * @res_size:	size of the reserved address space in bytes; read only.
+ * @vcm:	VCM context; internal, read only for MMU drivers.
+ * @phys:	pointer to physical memory bound to this reservation; NULL
+ *		if no physical memory is bound; read only.
+ *
+ * This structure represents a portion of virtually contiguous address
+ * space reserved for use with the driver.  Once address space is
+ * reserved, physical memory can be bound to it so that it will point
+ * to real memory.
+ */
+struct vcm_res {
+	dma_addr_t		start;
+	resource_size_t		bound_size;
+	resource_size_t		res_size;
+
+	struct vcm		*vcm;
+	struct vcm_phys		*phys;
+};
+
+
+/**
+ * vcm_destroy() - destroys a VCM context.
+ * @vcm:	VCM to destroy.
+ */
+void vcm_destroy(struct vcm *vcm);
+
+/**
+ * vcm_make_binding() - allocates memory and binds it to virtual address space
+ * @vcm:	VCM context to reserve virtual address space in
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @alloc_flags:	additional allocator flags; see vcm_alloc() for
+ *			description of those.
+ * @res_flags:	additional reservation flags; see vcm_reserve() for
+ *		description of those.
+ *
+ * This is a call that binds together three other calls:
+ * vcm_reserve(), vcm_alloc() and vcm_bind().  The purpose of this
+ * function is that on systems with no IO MMU separate calls to
+ * vcm_alloc() and vcm_reserve() may fail whereas when called together
+ * they may work correctly.
+ *
+ * This is a consequence of the fact that with no IO MMU the simulated
+ * virtual address must be the same as physical address, thus if first
+ * virtual address space were to be reserved and then physical memory
+ * allocated, both addresses may not match.
+ *
+ * With this call, a driver that simulates IO MMU may simply allocate
+ * a physical memory and when this succeeds create correct reservation.
+ *
+ * In short, if device drivers do not need more advanced MMU
+ * functionality, they should limit themselves to this function
+ * since then the drivers may be easily ported to systems without IO
+ * MMU.
+ *
+ * To access the vcm_phys structure created by this call a phys field
+ * of returned vcm_res structure should be used.
+ *
+ * On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_res  *__must_check
+vcm_make_binding(struct vcm *vcm, resource_size_t size,
+		 unsigned alloc_flags, unsigned res_flags);
+
+/**
+ * vcm_map() - makes a reservation and binds physical memory to it
+ * @vcm:	VCM context
+ * @phys:	physical memory to bind.
+ * @flags:	additional flags; see vcm_reserve() for a description
+ *		of those.
+ *
+ * This is a call that binds together two other calls: vcm_reserve()
+ * and vcm_bind().  If all you need is reserve address space and
+ * bind physical memory it's better to use this call since it may
+ * create better mappings in some situations.
+ *
+ * Drivers may be optimised in such a way that it won't be possible to
+ * use reservation with a different physical memory.
+ *
+ * On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_res *__must_check
+vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags);
+
+/**
+ * vcm_alloc() - allocates a physical memory for use with vcm_res.
+ * @vcm:	VCM context allocation is performed in.
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @flags:	additional allocator flags; XXX FIXME: describe
+ *
+ * In case of some MMU drivers, the @vcm may be important and later
+ * binding (vcm_bind()) may fail if done on another @vcm.
+ *
+ * On success returns a vcm_phys structure representing an allocated
+ * physical memory that can be bound to reserved virtual address
+ * space.  On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_phys *__must_check
+vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+/**
+ * vcm_free() - frees an allocated physical memory
+ * @phys:	physical memory to free.
+ *
+ * If the physical memory is bound to any reserved address space it
+ * must be unbound first.  Otherwise a warning will be issued and
+ * the memory won't be freed causing memory leaks.
+ */
+void vcm_free(struct vcm_phys *phys);
+
+/**
+ * vcm_reserve() - reserves a portion of virtual address space.
+ * @vcm:	VCM context reservation is performed in.
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @flags:	additional reservation flags; XXX FIXME: describe
+ *
+ * On success returns a vcm_res structure representing a reserved
+ * (contiguous) virtual address space that physical memory can be
+ * bound to (using vcm_bind()).  On error returns a pointer which
+ * yields true when tested with IS_ERR().
+ */
+struct vcm_res *__must_check
+vcm_reserve(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+/**
+ * vcm_unreserve() - destroys a virtual address space reservation
+ * @res:	reservation to destroy.
+ *
+ * If any physical memory is bound to the reserved address space it
+ * must be unbound first.  Otherwise it will be unbound and a warning
+ * will be issued.
+ */
+void vcm_unreserve(struct vcm_res *res);
+
+/**
+ * vcm_bind() - binds a physical memory to virtual address space
+ * @res:	virtual address space to bind the physical memory.
+ * @phys:	physical memory to bind to the virtual addresses.
+ *
+ * The mapping won't be active unless vcm_activate() has been called
+ * on the VCM that @res was created in.
+ *
+ * If @phys is already bound to @res this function returns -EALREADY.
+ * If some other physical memory is bound to @res -EADDRINUSE is
+ * returned.  If the size of the physical memory is larger than the
+ * reserved virtual space -ENOSPC is returned.  In all other cases the
+ * physical memory is bound to the virtual address and on success zero
+ * is returned, on error a negative number.
+ */
+int  __must_check vcm_bind(struct vcm_res *res, struct vcm_phys *phys);
+
+/**
+ * vcm_unbind() - unbinds a physical memory from virtual address space
+ * @res:	virtual address space to unbind the physical memory from.
+ *
+ * This reverses the effect of the vcm_bind() function.  Function
+ * returns physical space that was bound to the reservation (or NULL
+ * if no space was bound in which case also a warning is issued).
+ */
+struct vcm_phys *vcm_unbind(struct vcm_res *res);
+
+/**
+ * vcm_destroy_binding() - destroys the binding
+ * @res:	a bound reserved address space to destroy.
+ *
+ * This function incorporates three functions: vcm_unbind(),
+ * vcm_free() and vcm_unreserve() (in that order) in one call.
+ */
+void vcm_destroy_binding(struct vcm_res *res);
+
+/**
+ * vcm_unmap() - unbinds physical memory and unreserves address space
+ * @res:	reservation to destroy
+ *
+ * This is a call that binds together two other calls: vcm_unbind()
+ * and vcm_unreserve().
+ */
+static inline void vcm_unmap(struct vcm_res *res)
+{
+	vcm_unbind(res);
+	vcm_unreserve(res);
+}
+
+/**
+ * vcm_activate() - activates bindings in VCM.
+ * @vcm:	VCM to activate bindings in.
+ *
+ * All of the bindings on the @vcm done before this function is called
+ * are inactive and do not take effect.  The call to this function
+ * guarantees that all bindings are sent to the hardware MMU (if any).
+ *
+ * After VCM is activated all bindings will be automatically updated
+ * on the hardware MMU, so there is no need to call this function
+ * after each vcm_bind()/vcm_unbind().
+ *
+ * Each call to vcm_activate() should be later accompanied by a call
+ * to vcm_deactivate().  Otherwise a warning will be issued when VCM
+ * context is destroyed (vcm_destroy()).  This function can be called
+ * several times.
+ *
+ * On success returns zero, on error a negative error code.
+ */
+int  __must_check vcm_activate(struct vcm *vcm);
+
+/**
+ * vcm_deactivate() - deactivates bindings in VCM.
+ * @vcm:	VCM to deactivate bindings in.
+ *
+ * This function reverts effect of the vcm_activate() function.  After
+ * calling this function caller has no guarantee that bindings defined
+ * in VCM are active.
+ *
+ * If this is called without a prior call to vcm_activate() a warning
+ * is issued.
+ */
+void vcm_deactivate(struct vcm *vcm);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index b410910..0445f68 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -371,3 +371,33 @@ config CMA_BEST_FIT
 	  the number of allocated regions and usually much smaller).  It
 	  allocates area from the smallest hole that is big enough for
 	  allocation in question.
+
+
+config VCM
+	bool "Virtual Contiguous Memory framework"
+	help
+	  This enables the Virtual Contiguous Memory framework which
+	  provides an abstraction for virtual address space provided by
+	  various MMUs present on the platform.
+
+	  The framework uses pluggable MMU drivers for hardware MMUs and,
+	  if drivers obey some limitations, it can also be used on
+	  platforms with no MMU.
+
+	  For more information see
+	  <Documentation/virtual-contiguous-memory.txt>.  If unsure, say
+	  "n".
+
+# Select it if you need vcm_mmu wrapper driver
+config VCM_MMU
+	select VCM_PHYS
+	select GENERIC_ALLOCATOR
+	bool
+
+# Select if you need vcm_o2o wrapper driver
+config VCM_O2O
+	bool
+
+# Select if you need vcm_phys_alloc() or vcm_phys_walk() functions
+config VCM_PHYS
+	bool
diff --git a/mm/Makefile b/mm/Makefile
index d8c717f..e908202 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -49,3 +49,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
 obj-$(CONFIG_CMA) += cma.o
 obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
+obj-$(CONFIG_VCM) += vcm.o
diff --git a/mm/vcm.c b/mm/vcm.c
new file mode 100644
index 0000000..ef3d1a6
--- /dev/null
+++ b/mm/vcm.c
@@ -0,0 +1,932 @@
+/*
+ * Virtual Contiguous Memory core
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#include <linux/vcm-drv.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/genalloc.h>
+
+#include <asm/atomic.h>
+
+/******************************** Devices API *******************************/
+
+void vcm_destroy(struct vcm *vcm)
+{
+	if (WARN_ON(atomic_read(&vcm->activations)))
+		vcm->driver->deactivate(vcm);
+
+	if (vcm->driver->cleanup)
+		vcm->driver->cleanup(vcm);
+	else
+		kfree(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_destroy);
+
+static void
+__vcm_alloc_and_reserve(struct vcm *vcm, resource_size_t size,
+			struct vcm_phys **phys, unsigned alloc_flags,
+			struct vcm_res **res, unsigned res_flags)
+{
+	int ret, alloc = 0;
+
+	if (WARN_ON(!vcm) || !size) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	size = PAGE_ALIGN(size);
+
+	if (vcm->driver->alloc) {
+		ret = vcm->driver->alloc(vcm, size,
+					 phys, alloc_flags, res, res_flags);
+		if (ret)
+			goto error;
+		alloc = 1;
+	} else if ((res && !vcm->driver->res) || (phys && !vcm->driver->phys)) {
+		ret = -EOPNOTSUPP;
+		goto error;
+	}
+
+	if (res) {
+		if (!alloc) {
+			*res = vcm->driver->res(vcm, size, res_flags);
+			if (IS_ERR(*res)) {
+				ret = PTR_ERR(*res);
+				goto error;
+			}
+		}
+		(*res)->bound_size = 0;
+		(*res)->vcm = vcm;
+		(*res)->phys = NULL;
+	}
+
+	if (phys) {
+		if (!alloc) {
+			*phys = vcm->driver->phys(vcm, size, alloc_flags);
+			if (IS_ERR(*phys)) {
+				if (res)
+					vcm_unreserve(*res);
+				ret = PTR_ERR(*phys);
+				goto error;
+			}
+		}
+		atomic_set(&(*phys)->bindings, 0);
+		WARN_ON(!(*phys)->free);
+	}
+
+	return;
+
+error:
+	if (phys)
+		*phys = ERR_PTR(ret);
+	if (res)
+		*res = ERR_PTR(ret);
+}
+
+struct vcm_res *__must_check
+vcm_make_binding(struct vcm *vcm, resource_size_t size,
+		 unsigned alloc_flags, unsigned res_flags)
+{
+	struct vcm_phys *phys;
+	struct vcm_res *res;
+
+	if (WARN_ON(!vcm || !size || (size & (PAGE_SIZE - 1))))
+		return ERR_PTR(-EINVAL);
+	else if (vcm->driver->alloc || !vcm->driver->map) {
+		int ret;
+
+		__vcm_alloc_and_reserve(vcm, size, &phys, alloc_flags,
+					&res, res_flags);
+
+		if (IS_ERR(res))
+			return res;
+
+		ret = vcm_bind(res, phys);
+		if (!ret)
+			return res;
+
+		if (vcm->driver->unreserve)
+			vcm->driver->unreserve(res);
+		phys->free(phys);
+		return ERR_PTR(ret);
+	} else {
+		__vcm_alloc_and_reserve(vcm, size, &phys, alloc_flags,
+					NULL, 0);
+
+		if (IS_ERR(phys))
+			return ERR_CAST(phys);
+
+		res = vcm->driver->map(vcm, phys, res_flags);
+		if (IS_ERR(res))
+			phys->free(phys);
+		return res;
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_make_binding);
+
+struct vcm_phys *__must_check
+vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_phys *phys;
+
+	__vcm_alloc_and_reserve(vcm, size, &phys, flags, NULL, 0);
+
+	return phys;
+}
+EXPORT_SYMBOL_GPL(vcm_alloc);
+
+struct vcm_res *__must_check
+vcm_reserve(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_res *res;
+
+	__vcm_alloc_and_reserve(vcm, size, NULL, 0, &res, flags);
+
+	return res;
+}
+EXPORT_SYMBOL_GPL(vcm_reserve);
+
+struct vcm_res *__must_check
+vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags)
+{
+	struct vcm_res *res;
+	int ret;
+
+	if (WARN_ON(!vcm))
+		return ERR_PTR(-EINVAL);
+
+	if (vcm->driver->map)
+		return vcm->driver->map(vcm, phys, flags);
+
+	res = vcm_reserve(vcm, phys->size, flags);
+	if (IS_ERR(res))
+		return res;
+
+	ret = vcm_bind(res, phys);
+	if (!ret)
+		return res;
+
+	vcm_unreserve(res);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(vcm_map);
+
+void vcm_unreserve(struct vcm_res *res)
+{
+	if (!WARN_ON(!res)) {
+		if (WARN_ON(res->phys))
+			vcm_unbind(res);
+		if (!WARN_ON_ONCE(!res->vcm->driver->unreserve))
+			res->vcm->driver->unreserve(res);
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_unreserve);
+
+void vcm_free(struct vcm_phys *phys)
+{
+	if (!WARN_ON(!phys || atomic_read(&phys->bindings)) && phys->free)
+		phys->free(phys);
+}
+EXPORT_SYMBOL_GPL(vcm_free);
+
+int  __must_check vcm_bind(struct vcm_res *res, struct vcm_phys *phys)
+{
+	int ret;
+
+	if (WARN_ON(!res || !phys))
+		return -EINVAL;
+
+	if (res->phys == phys)
+		return -EALREADY;
+
+	if (res->phys)
+		return -EADDRINUSE;
+
+	if (phys->size > res->res_size)
+		return -ENOSPC;
+
+	if (!res->vcm->driver->bind)
+		return -EOPNOTSUPP;
+
+	atomic_inc(&phys->bindings);
+	ret = res->vcm->driver->bind(res, phys);
+	if (!ret) {
+		/* Record the binding only if the driver succeeded. */
+		res->phys = phys;
+		res->bound_size = phys->size;
+	} else {
+		atomic_dec(&phys->bindings);
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vcm_bind);
+
+struct vcm_phys *vcm_unbind(struct vcm_res *res)
+{
+	struct vcm_phys *phys = NULL;
+	if (!WARN_ON(!res || !res->phys)) {
+		phys = res->phys;
+		if (res->vcm->driver->unbind)
+			res->vcm->driver->unbind(res);
+		atomic_dec(&phys->bindings);
+		res->phys = NULL;
+		res->bound_size = 0;
+	}
+	return phys;
+}
+EXPORT_SYMBOL_GPL(vcm_unbind);
+
+void vcm_destroy_binding(struct vcm_res *res)
+{
+	if (!WARN_ON(!res)) {
+		struct vcm_phys *phys = vcm_unbind(res);
+		if (phys)
+			vcm_free(phys);
+		vcm_unreserve(res);
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_destroy_binding);
+
+int  __must_check vcm_activate(struct vcm *vcm)
+{
+	if (WARN_ON(!vcm))
+		return -EINVAL;
+	else if (atomic_inc_return(&vcm->activations) != 1
+	      || !vcm->driver->activate)
+		return 0;
+	else
+		return vcm->driver->activate(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_activate);
+
+void vcm_deactivate(struct vcm *vcm)
+{
+	if (!WARN_ON(!vcm || !atomic_read(&vcm->activations))
+	 && atomic_dec_and_test(&vcm->activations)
+	 && vcm->driver->deactivate)
+		vcm->driver->deactivate(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_deactivate);
+
+
+/****************************** VCM Drivers API *****************************/
+
+struct vcm *__must_check vcm_init(struct vcm *vcm)
+{
+	if (WARN_ON(!vcm || !vcm->size
+		 || ((vcm->start | vcm->size) & ~PAGE_MASK)
+		 || !vcm->driver || !vcm->driver->unreserve))
+		return ERR_PTR(-EINVAL);
+
+	atomic_set(&vcm->activations, 0);
+
+	return vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_init);
+
+
+/*************************** Hardware MMU wrapper ***************************/
+
+#ifdef CONFIG_VCM_MMU
+
+struct vcm_mmu_res {
+	struct vcm_res			res;
+	struct list_head		bound;
+};
+
+static void vcm_mmu_cleanup(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	WARN_ON(mutex_is_locked(&mmu->mutex));
+	gen_pool_destroy(mmu->pool);
+	if (mmu->driver->cleanup)
+		mmu->driver->cleanup(vcm);
+	else
+		kfree(mmu);
+}
+
+static struct vcm_res *
+vcm_mmu_res(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	resource_size_t s, alignment;
+	struct vcm_mmu_res *res;
+	const unsigned char *orders;
+	dma_addr_t addr;
+
+	res = kzalloc(sizeof *res, GFP_KERNEL);
+	if (!res)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Use the largest alignment that makes sense for given
+	 * reservation size.  For instance, if MMU supports 1M pages
+	 * and reservation is 1M it would be nice to be able to have
+	 * reservation aligned to 1M so that if the physical memory
+	 * will consist of a single 1M block (aligned to 1M) a single
+	 * map entry will suffice.
+	 */
+	s = size >> PAGE_SHIFT;
+	for (orders = mmu->driver->orders; !(s >> *orders); ++orders)
+		/* nop */;
+	alignment = (resource_size_t)1 << (*orders + PAGE_SHIFT);
+
+	/*
+	 * We are allocating a bit more so that if allocation is not
+	 * aligned we can shift inside allocated block to get
+	 * allocation we want.
+	 */
+	s = size + alignment - PAGE_SIZE;
+
+	mutex_lock(&mmu->mutex);
+
+	addr = gen_pool_alloc(mmu->pool, s);
+
+	if (!addr) {
+		kfree(res);
+		res = ERR_PTR(-ENOSPC);
+	} else if (alignment > PAGE_SIZE) {
+		/*
+		 * Align the reservation.  We can safely do this since
+		 * we have allocated more memory than we needed and we
+		 * can move the reservation around.
+		 */
+		dma_addr_t start = ALIGN(addr, alignment);
+
+		/* Free unused memory. */
+		if (start != addr)
+			gen_pool_free(mmu->pool, addr, start - addr);
+		if (start + size != addr + s)
+			gen_pool_free(mmu->pool, start + size,
+				      addr + s - (start + size));
+		addr = start;
+	}
+
+	mutex_unlock(&mmu->mutex);
+
+	if (IS_ERR(res))
+		return ERR_CAST(res);
+
+	INIT_LIST_HEAD(&res->bound);
+	res->res.start = addr;
+	res->res.res_size = size;
+
+	return &res->res;
+}
+
+static struct vcm_phys *
+vcm_mmu_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	return vcm_phys_alloc(size, flags,
+			      container_of(vcm, struct vcm_mmu,
+					   vcm)->driver->orders);
+}
+
+static int __must_check
+__vcm_mmu_activate(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	if (mmu->driver->activate)
+		return mmu->driver->activate(res, phys);
+
+	return vcm_phys_walk(res->start, phys, mmu->driver->orders,
+			     mmu->driver->activate_page,
+			     mmu->driver->deactivate_page, res->vcm);
+}
+
+static void __vcm_mmu_deactivate(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	if (mmu->driver->deactivate)
+		return mmu->driver->deactivate(res, phys);
+
+	vcm_phys_walk(res->start, phys, mmu->driver->orders,
+		      mmu->driver->deactivate_page, NULL, res->vcm);
+}
+
+static int vcm_mmu_bind(struct vcm_res *_res, struct vcm_phys *phys)
+{
+	struct vcm_mmu_res *res = container_of(_res, struct vcm_mmu_res, res);
+	struct vcm_mmu *mmu = container_of(_res->vcm, struct vcm_mmu, vcm);
+	int ret;
+
+	mutex_lock(&mmu->mutex);
+
+	if (mmu->activated) {
+		ret = __vcm_mmu_activate(_res, phys);
+		if (ret)
+			goto done;
+	}
+
+	list_add_tail(&res->bound, &mmu->bound_res);
+	ret = 0;
+
+done:
+	mutex_unlock(&mmu->mutex);
+
+	return ret;
+}
+
+static void vcm_mmu_unbind(struct vcm_res *_res)
+{
+	struct vcm_mmu_res *res = container_of(_res, struct vcm_mmu_res, res);
+	struct vcm_mmu *mmu = container_of(_res->vcm, struct vcm_mmu, vcm);
+
+	mutex_lock(&mmu->mutex);
+
+	if (mmu->activated)
+		__vcm_mmu_deactivate(_res, _res->phys);
+
+	list_del_init(&res->bound);
+
+	mutex_unlock(&mmu->mutex);
+}
+
+static void vcm_mmu_unreserve(struct vcm_res *res)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	mutex_lock(&mmu->mutex);
+	gen_pool_free(mmu->pool, res->start, res->res_size);
+	mutex_unlock(&mmu->mutex);
+	kfree(container_of(res, struct vcm_mmu_res, res));
+}
+
+static int vcm_mmu_activate(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	struct vcm_mmu_res *r, *rr;
+	int ret;
+
+	mutex_lock(&mmu->mutex);
+
+	list_for_each_entry(r, &mmu->bound_res, bound) {
+		ret = __vcm_mmu_activate(&r->res, r->res.phys);
+		if (ret >= 0)
+			continue;
+
+		/* Roll back the reservations activated so far. */
+		list_for_each_entry(rr, &mmu->bound_res, bound) {
+			if (r == rr)
+				goto done;
+			__vcm_mmu_deactivate(&rr->res, rr->res.phys);
+		}
+	}
+
+	mmu->activated = 1;
+	ret = 0;
+
+done:
+	mutex_unlock(&mmu->mutex);
+
+	return ret;
+}
+
+static void vcm_mmu_deactivate(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	struct vcm_mmu_res *r;
+
+	mutex_lock(&mmu->mutex);
+
+	mmu->activated = 0;
+
+	list_for_each_entry(r, &mmu->bound_res, bound)
+		__vcm_mmu_deactivate(&r->res, r->res.phys);
+
+	mutex_unlock(&mmu->mutex);
+}
+
+struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu)
+{
+	static const struct vcm_driver driver = {
+		.cleanup	= vcm_mmu_cleanup,
+		.res		= vcm_mmu_res,
+		.phys		= vcm_mmu_phys,
+		.bind		= vcm_mmu_bind,
+		.unbind		= vcm_mmu_unbind,
+		.unreserve	= vcm_mmu_unreserve,
+		.activate	= vcm_mmu_activate,
+		.deactivate	= vcm_mmu_deactivate,
+	};
+
+	struct vcm *vcm;
+	int ret;
+
+	if (WARN_ON(!mmu || !mmu->driver ||
+		    !(mmu->driver->activate ||
+		      (mmu->driver->activate_page &&
+		       mmu->driver->deactivate_page)) ||
+		    !(mmu->driver->deactivate ||
+		      mmu->driver->deactivate_page)))
+		return ERR_PTR(-EINVAL);
+
+	mmu->vcm.driver = &driver;
+	vcm = vcm_init(&mmu->vcm);
+	if (IS_ERR(vcm))
+		return vcm;
+
+	mmu->pool = gen_pool_create(PAGE_SHIFT, -1);
+	if (!mmu->pool)
+		return ERR_PTR(-ENOMEM);
+
+	ret = gen_pool_add(mmu->pool, mmu->vcm.start, mmu->vcm.size, -1);
+	if (ret) {
+		gen_pool_destroy(mmu->pool);
+		return ERR_PTR(ret);
+	}
+
+	vcm->driver     = &driver;
+	INIT_LIST_HEAD(&mmu->bound_res);
+	mutex_init(&mmu->mutex);
+
+	return &mmu->vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_mmu_init);
+
+#endif
+
+/**************************** One-to-One wrapper ****************************/
+
+#ifdef CONFIG_VCM_O2O
+
+struct vcm_o2o_binding {
+	void			*priv;
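+	/*
+	 * Bit 0 is set while only one of the two halves (the
+	 * reservation and the physical memory) is outstanding: it is
+	 * set on creation, cleared by vcm_o2o_map() when the
+	 * reservation is handed out, and test-and-set by both
+	 * vcm_o2o_free() and vcm_o2o_unreserve() so that whichever of
+	 * the two runs last frees the whole structure.
+	 */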
+	unsigned long		dead[1];
+	struct vcm_res		res;
+	struct vcm_phys		phys;
+	/* vcm_phys is variable length, don't put anything at the end */
+};
+
+static void vcm_o2o_cleanup(struct vcm *vcm)
+{
+	struct vcm_o2o *o2o = container_of(vcm, struct vcm_o2o, vcm);
+	if (o2o->driver->cleanup)
+		o2o->driver->cleanup(vcm);
+	else
+		kfree(o2o);
+}
+
+static void vcm_o2o_free(struct vcm_phys *phys)
+{
+	struct vcm_o2o_binding *b =
+		container_of(phys, struct vcm_o2o_binding, phys);
+	struct vcm_o2o *o2o =
+		container_of(b->res.vcm, struct vcm_o2o, vcm);
+	o2o->driver->free(phys->parts, b->priv);
+	if (test_and_set_bit(0, b->dead))
+		kfree(b);
+}
+
+static void vcm_o2o_unreserve(struct vcm_res *res)
+{
+	struct vcm_o2o_binding *b =
+		container_of(res, struct vcm_o2o_binding, res);
+	if (test_and_set_bit(0, b->dead))
+		kfree(b);
+}
+
+static struct vcm_phys *
+vcm_o2o_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_o2o *o2o = container_of(vcm, struct vcm_o2o, vcm);
+	struct vcm_o2o_binding *b;
+	void *priv;
+
+	b = kmalloc(sizeof *b + sizeof *b->phys.parts, GFP_KERNEL);
+	if (!b)
+		return ERR_PTR(-ENOMEM);
+
+	b->phys.parts->start = 0;
+	b->phys.parts->size  = size;
+	priv = o2o->driver->alloc(vcm, b->phys.parts, flags);
+	if (IS_ERR(priv)) {
+		kfree(b);
+		return ERR_CAST(priv);
+	}
+
+	if (WARN_ON(!b->phys.parts->size ||
+		    (b->phys.parts->start | b->phys.parts->size)
+		  & ~PAGE_MASK)) {
+		o2o->driver->free(b->phys.parts, priv);
+		kfree(b);
+		return ERR_PTR(-EINVAL);
+	}
+
+	b->priv		= priv;
+	b->dead[0]	= ~0;
+	b->res.start	= b->phys.parts->start;
+	b->res.res_size	= b->phys.parts->size;
+	b->phys.size	= b->phys.parts->size;
+	b->phys.count	= 1;
+	b->phys.free	= vcm_o2o_free;
+
+	return &b->phys;
+}
+
+static struct vcm_res *
+vcm_o2o_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags)
+{
+	struct vcm_o2o_binding *b =
+		container_of(phys, struct vcm_o2o_binding, phys);
+
+	if (!test_and_clear_bit(0, b->dead))
+		return ERR_PTR(-EBUSY);
+
+	return &b->res;
+}
+
+static int vcm_o2o_bind(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_o2o_binding *b =
+		container_of(res, struct vcm_o2o_binding, res);
+
+	if (&b->phys != phys)
+		return -EOPNOTSUPP;
+
+	if (WARN_ON(test_bit(0, b->dead)))
+		return -EINVAL;
+
+	return 0;
+}
+
+struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o)
+{
+	static const struct vcm_driver driver = {
+		.cleanup	= vcm_o2o_cleanup,
+		.phys		= vcm_o2o_phys,
+		.map		= vcm_o2o_map,
+		.bind		= vcm_o2o_bind,
+		.unreserve	= vcm_o2o_unreserve,
+	};
+
+	if (WARN_ON(!o2o || !o2o->driver ||
+		    !o2o->driver->alloc || !o2o->driver->free))
+		return ERR_PTR(-EINVAL);
+
+	o2o->vcm.driver = &driver;
+	return vcm_init(&o2o->vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_o2o_init);
+
+#endif
+
+/************************ Physical memory management ************************/
+
+#ifdef CONFIG_VCM_PHYS
+
+struct vcm_phys_list {
+	struct vcm_phys_list	*next;
+	unsigned		count;
+	struct vcm_phys_part	parts[31];
+};
+
+static struct vcm_phys_list *__must_check
+vcm_phys_alloc_list_order(struct vcm_phys_list *last, resource_size_t *pages,
+			  unsigned flags, unsigned order, unsigned *total)
+{
+	unsigned count;
+
+	count	= *pages >> order;
+
+	/* So, we need count order-order pages */
+	do {
+		struct page *p = alloc_pages(GFP_DMA, order);
+
+		if (!p)
+			/*
+			 * If allocation failed we may still
+			 * try to continue allocating smaller
+			 * pages.
+			 */
+			break;
+
+		if (last->count == ARRAY_SIZE(last->parts)) {
+			struct vcm_phys_list *l;
+			l = kmalloc(sizeof *l, GFP_KERNEL);
+			if (!l) {
+				__free_pages(p, order);
+				return NULL;
+			}
+
+			l->next = NULL;
+			l->count = 0;
+			last->next = l;
+			last = l;
+		}
+
+		last->parts[last->count].start =
+			page_to_pfn(p) << PAGE_SHIFT;
+		last->parts[last->count].size =
+			(resource_size_t)1 << (order + PAGE_SHIFT);
+		++last->count;
+		++*total;
+		*pages -= 1 << order;
+	} while (--count);
+
+	return last;
+}
+
+static unsigned __must_check
+vcm_phys_alloc_list(struct vcm_phys_list *first,
+		    resource_size_t size, unsigned flags,
+		    const unsigned char *orders)
+{
+	struct vcm_phys_list *last = first;
+	unsigned total_parts = 0;
+	resource_size_t pages;
+
+	/*
+	 * We are trying to allocate pages as large as possible but
+	 * not larger than the pages supported by the MMU driver that
+	 * called us (ie. the ones listed in the orders array).  This
+	 * makes it possible to map the region using the fewest
+	 * possible number of entries.
+	 */
+	pages = size >> PAGE_SHIFT;
+	do {
+		while (!(pages >> *orders))
+			++orders;
+
+		last = vcm_phys_alloc_list_order(last, &pages, flags, *orders,
+						 &total_parts);
+		if (!last)
+			return 0;
+
+	} while (*orders++ && pages);
+
+	if (pages)
+		return 0;
+
+	return total_parts;
+}
+
+static void vcm_phys_free_parts(struct vcm_phys_part *parts, unsigned count)
+{
+	while (count--) {
+		/* parts->start is a physical address, not a kernel virtual one. */
+		__free_pages(pfn_to_page(parts->start >> PAGE_SHIFT),
+			     ffs(parts->size) - 1 - PAGE_SHIFT);
+		++parts;
+	}
+}
+
+static void vcm_phys_alloc_cleanup(struct vcm_phys_list *lst)
+{
+	struct vcm_phys_list *first = lst;
+	do {
+		struct vcm_phys_list *l;
+
+		vcm_phys_free_parts(lst->parts, lst->count);
+
+		l = lst->next;
+		if (lst != first)
+			kfree(lst);
+		lst = l;
+	} while (lst);
+}
+
+static void vcm_phys_free(struct vcm_phys *phys)
+{
+	vcm_phys_free_parts(phys->parts, phys->count);
+}
+
+struct vcm_phys *__must_check
+vcm_phys_alloc(resource_size_t size, unsigned flags,
+	       const unsigned char *orders)
+{
+	struct vcm_phys_list first = { NULL, 0 }, *lst;
+	struct vcm_phys_part *out;
+	struct vcm_phys *phys;
+	unsigned count;
+
+	if (WARN_ON((size & (PAGE_SIZE - 1)) || !size || !orders))
+		return ERR_PTR(-EINVAL);
+
+	count = vcm_phys_alloc_list(&first, size, flags, orders);
+	if (!count)
+		goto error;
+
+	phys = kmalloc(sizeof *phys + count * sizeof *phys->parts, GFP_KERNEL);
+	if (!phys)
+		goto error;
+
+	phys->free = vcm_phys_free;
+	phys->count = count;
+	phys->size = size;
+
+	out = phys->parts;
+	lst = &first;
+	do {
+		struct vcm_phys_list *l;
+
+		memcpy(out, lst->parts, lst->count * sizeof *out);
+		out += lst->count;
+
+		l = lst->next;
+		if (lst != &first)
+			kfree(lst);
+		lst = l;
+	} while (lst);
+
+	return phys;
+
+error:
+	vcm_phys_alloc_cleanup(&first);
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(vcm_phys_alloc);
+
+static inline bool is_of_order(dma_addr_t size, unsigned order)
+{
+	return !(size & (((dma_addr_t)PAGE_SIZE << order) - 1));
+}
+
+static int
+__vcm_phys_walk_part(dma_addr_t vaddr, const struct vcm_phys_part *part,
+		     const unsigned char *orders,
+		     int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				     unsigned order, void *priv), void *priv,
+		     unsigned *limit)
+{
+	resource_size_t size = part->size;
+	dma_addr_t paddr = part->start;
+	resource_size_t ps;
+
+	while (!is_of_order(vaddr, *orders))
+		++orders;
+	while (!is_of_order(paddr, *orders))
+		++orders;
+
+	ps = PAGE_SIZE << *orders;
+	for (; *limit && size; --*limit) {
+		int ret;
+
+		while (ps > size)
+			ps = PAGE_SIZE << *++orders;
+
+		ret = callback(vaddr, paddr, *orders, priv);
+		if (ret < 0)
+			return ret;
+
+		ps = PAGE_SIZE << *orders;
+		vaddr += ps;
+		paddr += ps;
+		size  -= ps;
+	}
+
+	return 0;
+}
+
+int vcm_phys_walk(dma_addr_t _vaddr, const struct vcm_phys *phys,
+		  const unsigned char *orders,
+		  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *arg),
+		  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *arg),
+		  void *priv)
+{
+	unsigned limit = ~0;
+	int r = 0;
+
+	if (WARN_ON(!phys || ((_vaddr | phys->size) & (PAGE_SIZE - 1)) ||
+		    !phys->size || !orders || !callback))
+		return -EINVAL;
+
+	for (;;) {
+		const struct vcm_phys_part *part = phys->parts;
+		unsigned count = phys->count;
+		dma_addr_t vaddr = _vaddr;
+		int ret = 0;
+
+		for (; count && limit; --count, ++part) {
+			ret = __vcm_phys_walk_part(vaddr, part, orders,
+						   callback, priv, &limit);
+			if (ret)
+				break;
+
+			vaddr += part->size;
+		}
+
+		if (r)
+			/* We passed error recovery */
+			return r;
+
+		/*
+		 * Either the operation succeeded or we were not provided
+		 * with a recovery callback -- return.
+		 */
+		if (!ret || !recovery)
+			return ret;
+
+		/*
+		 * Switch to recovery: so far limit was counting down
+		 * from ~0, so ~0 - limit is the number of pages already
+		 * processed -- exactly how many pages the recovery
+		 * callback needs to visit.
+		 */
+		limit = ~0 - limit;
+		callback = recovery;
+		r = ret;
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_phys_walk);
+
+#endif
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 7/9] mm: vcm: Virtual Contiguous Memory framework added
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This commit adds the Virtual Contiguous Memory framework which
provides an abstraction for virtual address space provided by
various MMUs present on the platform.

The framework uses pluggable MMU drivers for hardware MMUs and,
if drivers obey some limitations, it can also be used on
platforms with no MMU.

For more information see
<Documentation/virtual-contiguous-memory.txt>.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/00-INDEX                      |    2 +
 Documentation/virtual-contiguous-memory.txt |  853 ++++++++++++++++++++++++
 include/linux/vcm-drv.h                     |  299 +++++++++
 include/linux/vcm.h                         |  275 ++++++++
 mm/Kconfig                                  |   30 +
 mm/Makefile                                 |    1 +
 mm/vcm.c                                    |  932 +++++++++++++++++++++++++++
 7 files changed, 2392 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual-contiguous-memory.txt
 create mode 100644 include/linux/vcm-drv.h
 create mode 100644 include/linux/vcm.h
 create mode 100644 mm/vcm.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index f93e787..7c79ffa 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -344,6 +344,8 @@ video-output.txt
 	- sysfs class driver interface to enable/disable a video output device.
 video4linux/
 	- directory with info regarding video/TV/radio cards and linux.
+virtual-contiguous-memory.txt
+	- documentation on virtual contiguous memory manager framework.
 vm/
 	- directory with info on the Linux vm code.
 volatile-considered-harmful.txt
diff --git a/Documentation/virtual-contiguous-memory.txt b/Documentation/virtual-contiguous-memory.txt
new file mode 100644
index 0000000..0c0e90c
--- /dev/null
+++ b/Documentation/virtual-contiguous-memory.txt
@@ -0,0 +1,853 @@
+                                                             -*- org -*-
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the implementation works, and how to implement MMU drivers
+that can be plugged into VCMM.  It also contains a rationale for VCMM.
+
+* The Virtual Contiguous Memory Manager
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses.  It also
+insulates the system from spurious or malicious device bus
+transactions and allows fine-grained mapping attribute control.  The
+Linux kernel core does not contain a generic API to handle IOMMU
+mapped memory; device driver writers must implement device specific
+code to interoperate with the Linux kernel core.  As the number of
+IOMMUs increases, coordinating the many address spaces mapped by all
+discrete IOMMUs becomes difficult without in-kernel support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device
+interoperation by treating devices with or without IOMMUs and all CPUs
+with or without MMUs, their mapping contexts and their mappings using
+common abstractions.  Physical hardware is given a generic device type
+and mapping contexts are abstracted into Virtual Contiguous Memory
+(VCM) regions.  Users "reserve" memory from VCMs and "bind" their
+reservations with physical memory.
+
+If drivers limit their use of VCM contexts to a certain subset of VCMM
+functionality, they can work with no changes with or without an MMU.
+
+** Why the VCMM is Needed
+
+Driver writers who control devices with IOMMUs must contend with
+device control and memory management.  Driver writers have a large
+device driver API that they can leverage to control their devices, but
+they are lacking a unified API to help them program mappings into
+IOMMUs and share those mappings with other devices and CPUs in the
+system.
+
+Sharing is complicated by Linux's CPU-centric VMM.  The CPU-centric
+model generally makes sense because average hardware only contains
+a MMU for the CPU and possibly a graphics MMU.  If every device in the
+system has one or more MMUs the CPU-centric memory management (MM)
+programming model breaks down.
+
+Abstracting IOMMU device programming into a common API has already
+begun in the Linux kernel.  It was built to abstract the difference
+between AMD and Intel IOMMUs to support x86 virtualization on both
+platforms.  The interface is listed in include/linux/iommu.h.  It
+contains interfaces for mapping and unmapping as well as domain
+management.  This interface has not gained widespread use outside the
+x86; PA-RISC, Alpha and SPARC architectures and ARM and PowerPC
+platforms all use their own mapping modules to control their IOMMUs.
+The VCMM contains an IOMMU programming layer, but since its
+abstraction supports map management independent of device control, the
+layer is not used directly.  This higher-level view enables a new
+kernel service, not just an IOMMU interoperation layer.
+
+** The General Idea: Map Management using Graphs
+
+Looking at mapping from a system-wide perspective reveals a general
+graph problem.  The VCMM's API is built to manage the general mapping
+graph.  Each node that talks to memory, either through an MMU or
+directly (physically mapped) can be thought of as the device-end of
+a mapping edge.  The other edge is the physical memory (or
+intermediate virtual space) that is mapped.  The figure below shows
+an example tree with a CPU and a few devices connected to the memory
+directly or through an MMU.
+
++--------------------------------------------------------------------+
+|                               Memory                               |
++--------------------------------------------------------------------+
+                                  |
+   +------------------+-----------+-------+----------+-----------+
+   |                  |                   |          |           |
++-----+            +-----+             +-----+  +--------+  +--------+
+| MMU |            | MMU |             | MMU |  | Device |  | Device |
++-----+            +-----+             +-----+  +--------+  +--------+
+   |                  |                   |
++-----+       +-------+---+-----....   +-----+
+| CPU |       |           |            | GPU |
++-----+  +--------+  +--------+        +-----+
+         | Device |  | Device |  ...
+         +--------+  +--------+
+
+For each MMU in the system a VCM context is created, through
+which drivers can make reservations and bind virtual addresses to
+physical space.  In the direct-mapped case the device is assigned
+a one-to-one MMU (as shown on the figure below). This scheme allows
+direct mapped devices to participate in general graph management.
+
++--------------------------------------------------------------------+
+|                               Memory                               |
++--------------------------------------------------------------------+
+                                  |
+   +------------------+-----------+-------+----------------+
+   |                  |                   |                |
++-----+            +-----+             +-----+      +------------+
+| MMU |            | MMU |             | MMU |      | One-to-One |
++-----+            +-----+             +-----+      +------------+
+   |                  |                   |                |
++-----+       +-------+---+-----....   +-----+       +-----+-----+
+| CPU |       |           |            | GPU |       |           |
++-----+  +--------+  +--------+        +-----+  +--------+  +--------+
+         | Device |  | Device |  ...            | Device |  | Device |
+         +--------+  +--------+                 +--------+  +--------+
+
+The CPU nodes can also be brought under the same mapping abstraction
+with the use of a light overlay on the existing VMM. This light
+overlay allows VCMM-managed mappings to interoperate with the common
+API.  The light overlay enables this without substantial modifications
+to the existing VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote
+CPU nodes that may be running other operating systems can be brought
+into the general abstraction.  Routing all memory management requests
+from a remote node through the central memory management framework
+enables new features like system-wide memory migration.  This feature
+may only be feasible for large buffers that are managed outside of the
+fast-path, but having remote allocation in a system enables features
+that are impossible to build without it.
+
+The fundamental objects that support graph-based map management are:
+Virtual Contiguous Memory contexts, reservations, and physical memory
+allocations.
+
+* Usage Overview
+
+In a nutshell, the platform initialises a VCM context for each MMU on
+the system, and possibly one-to-one VCM contexts, which are passed to
+device drivers.  Later on, drivers reserve virtual address space
+from the VCM context.  At this point no physical memory has been
+committed to the reservation.  To bind physical memory with a
+reservation, physical memory is allocated (possibly discontiguous) and
+then bound to the reservation.
+
+A single physical allocation can be bound to several different
+reservations, including reservations from different VCM contexts.  This
+allows devices connected to the memory banks through different MMUs (or
+directly) to share physical memory buffers; it also makes it possible
+to map such memory into the CPU's address space (be it kernel or user
+space)
+so that the same data can be accessed by the CPU.
+
+[[file:../include/linux/vcm.h][include/linux/vcm.h]] includes comments documenting each API.
+
+** Virtual Contiguous Memory context
+
+A Virtual Contiguous Memory context (VCM) abstracts an address space
+a device sees.  A VCM is created with a VCM driver dependent call.  It
+is destroyed with a call to:
+
+        void vcm_destroy(struct vcm *vcm);
+
+The newly created VCM instance can be passed to any function that needs to
+operate on or with a virtual contiguous memory region.  All internals
+of the VCM driver and how the mappings are handled is hidden and VCM
+driver dependent.
+
+** Bindings
+
+If all that a driver needs is to allocate some physical space and map it
+into its address space, a vcm_make_binding() call can be used:
+
+	struct vcm_res	*__must_check
+	vcm_make_binding(struct vcm *vcm, resource_size_t size,
+			 unsigned alloc_flags, unsigned res_flags);
+
+This call allocates physical memory, reserves virtual address space
+and binds the two together.  If all of these succeed, a reservation is
+returned which has physical memory associated with it.
+
+If the driver does not require more complicated VCMM functionality, it is
+desirable to use this function since it will work on both real MMUs
+and one-to-one mappings.
+
+To destroy a created binding, vcm_destroy_binding() can be used:
+
+        void vcm_destroy_binding(struct vcm_res *res);
+
+** Physical memory
+
+Physical memory allocations are handled using the following functions:
+
+	struct vcm_phys *__must_check
+	vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+	void vcm_free(struct vcm_phys *phys);
+
+It is noteworthy that physical space allocation is done in the context
+of a VCM.  This is especially important in case of one-to-one VCM
+contexts which cannot handle discontiguous physical memory.
+
+Also, depending on VCM context, the physical space may be allocated in
+parts of different sizes.  For instance, if a given MMU supports
+16MiB, 1MiB, 64KiB and 4KiB pages, it is likely that vcm_alloc() in
+the context of this MMU's driver will try to split the allocation into
+as few parts of those sizes as possible.
+
+In case of one-to-one VCM contexts, a physical memory allocated with
+the call to vcm_alloc() may be usable only with vcm_map() function.
+
+** Mappings
+
+The easiest way to map a physical space into virtual address space
+represented by VCM context is to use the vcm_map() function:
+
+	struct vcm_res *__must_check
+	vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags);
+
+This function reserves address space from the VCM context and binds
+physical space to it.  To reverse the process vcm_unmap() can be used:
+
+	void vcm_unmap(struct vcm_res *res);
+
+Similarly to vcm_make_binding(), use of vcm_map() may be advantageous
+over the use of vcm_reserve() followed by vcm_bind().  This is not
+only true for one-to-one mappings: if it so happens that the call to
+vcm_map() requests mapping of a physically contiguous space into kernel
+space, a direct mapping can be returned instead of creating a new one.
+
+In some cases, a reservation created with vcm_map() can be used only
+with the physical memory passed as the argument to vcm_map() (so if
+user chooses to call vcm_unbind() and then vcm_bind() on a different
+physical memory, the call may fail).
+
+** Reservations
+
+A reservation is a contiguous region allocated from a virtual address
+space represented by a VCM context.  Just after a reservation is
+created, no physical memory is bound to it.  To manage reservations the
+following two functions are provided:
+
+	struct vcm_res *__must_check
+	vcm_reserve(struct vcm *vcm, resource_size_t size,
+		    unsigned flags);
+
+	void vcm_unreserve(struct vcm_res *res);
+
+The first one creates a reservation of desired size, and the second
+one destroys it.
+
+** Binding memory
+
+To bind physical memory to a reservation the vcm_bind() function is
+used:
+
+	int __must_check vcm_bind(struct vcm_res *res,
+				  struct vcm_phys *phys);
+
+When the binding is no longer needed, vcm_unbind() destroys the
+connection:
+
+	struct vcm_phys *vcm_unbind(struct vcm_res *res);
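+
+For illustration, the following is a hypothetical, untested sketch of
+how reservation, allocation and binding fit together (the vcm context
+is assumed to have been obtained from platform code):
+
+	struct vcm_res *res;
+	struct vcm_phys *phys;
+	int ret;
+
+	res = vcm_reserve(vcm, SZ_1M, 0);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	phys = vcm_alloc(vcm, SZ_1M, 0);
+	if (IS_ERR(phys)) {
+		vcm_unreserve(res);
+		return PTR_ERR(phys);
+	}
+
+	ret = vcm_bind(res, phys);
+	if (ret) {
+		vcm_free(phys);
+		vcm_unreserve(res);
+		return ret;
+	}
+
+	/*
+	 * res->start is the device-visible address; the context still
+	 * needs to be activated (see the next section) for the mapping
+	 * to reach the hardware MMU.
+	 */
+
+	vcm_free(vcm_unbind(res));
+	vcm_unreserve(res);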
+
+** Activating mappings
+
+Unless a VCM context is activated, none of the bindings are actually
+guaranteed to be available.  When a device driver needs the mappings,
+it needs to call the vcm_activate() function to guarantee that the
+mappings are sent to the hardware MMU.
+
+	int  __must_check vcm_activate(struct vcm *vcm);
+
+After the VCM context is activated, all further bindings (made with
+vcm_make_binding(), vcm_map() or vcm_bind()) will be updated so there
+is no need to call vcm_activate() after each binding is done or
+undone.
+
+To deactivate the VCM context the vcm_deactivate() function is used:
+
+	void vcm_deactivate(struct vcm *vcm);
+
+Both of those functions can be called several times if all calls to
+vcm_activate() are paired with a later call to vcm_deactivate().
+
+** Device driver example
+
+The following is a simple, untested example of how platform and
+devices work together to use the VCM framework.  The platform
+initialises contexts for each MMU in the system and passes them to the
+correct drivers through platform device data.
+
+Device driver header file:
+
+	struct foo_platform_data {
+		/* ... */
+		struct vcm	*vcm;
+		/* ... */
+	};
+
+Platform code:
+
+	static int plat_bar_vcm_init(void)
+	{
+		struct foo_platform_data *fpdata;
+		struct vcm *vcm;
+
+		vcm = vcm_baz_create(...);
+		if (IS_ERR(vcm))
+			return PTR_ERR(vcm);
+
+		fpdata = dev_get_platdata(&foo_device.dev);
+		fpdata->vcm = vcm;
+
+		/* ... */
+
+		return 0;
+	}
+
+Device driver implementation:
+
+	struct foo_private {
+		/* ... */
+		struct vcm_res	*fw;
+		/* ... */
+	};
+
+	static inline struct vcm_res *__must_check
+	__foo_alloc(struct device *dev, size_t size)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		return vcm_make_binding(pdata->vcm, size, 0, 0);
+	}
+
+	static inline void __foo_free(struct vcm_res *res)
+	{
+		vcm_destroy_binding(res);
+	}
+
+	static int foo_probe(struct device *dev)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct foo_private *priv;
+
+		if (IS_ERR_OR_NULL(pdata->vcm))
+			return pdata->vcm ? PTR_ERR(pdata->vcm) : -EINVAL;
+
+		priv = kzalloc(sizeof *priv, GFP_KERNEL);
+		if (!priv)
+			return -ENOMEM;
+
+		/* ... */
+
+		priv->fw = __foo_alloc(dev, 1 << 20);
+		if (IS_ERR(priv->fw)) {
+			kfree(priv);
+			return PTR_ERR(priv->fw);
+		}
+		/* copy firmware to fw */
+
+		vcm_activate(pdata->vcm);
+
+		dev->p = priv;
+
+		return 0;
+	}
+
+	static int foo_remove(struct device *dev)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct foo_private *priv = dev->p;
+
+		/* ... */
+
+		vcm_deactivate(pdata->vcm);
+		__foo_free(priv->fw);
+
+		kfree(priv);
+
+		return 0;
+	}
+
+	static int foo_do_something(struct device *dev, /* ... */)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct vcm_res *buf;
+		int ret;
+
+		buf = __foo_alloc(dev, /* ... size ... */);
+		if (IS_ERR(buf))
+			return PTR_ERR(buf);
+
+		/*
+		 * buf->start is address visible from device's
+		 * perspective.
+		 */
+
+		/* ... set hardware up ... */
+
+		/* ... wait for completion ... */
+
+		__foo_free(buf);
+
+		return ret;
+	}
+
+In the above example only the vcm_make_binding() function is used, so
+the scheme will work not only on systems with an MMU but also in the
+case of a one-to-one VCM context.
+
+** IOMMU, one-to-one and VMM contexts
+
+The following example demonstrates mapping IOMMU, one-to-one and VMM
+reservations to the same physical memory.  For readability, error
+handling is not shown on the listings.
+
+First, each context needs to be created.  The call used for creating
+a context depends on the driver used.  The following is just an
+example of what this could look like:
+
+	struct vcm *vcm_vmm, *vcm_onetoone, *vcm_iommu;
+
+	vcm_vmm      = vcm_vmm_create();
+	vcm_onetoone = vcm_onetoone_create();
+	vcm_iommu    = vcm_foo_mmu_create();
+
+Once contexts are created, physical space needs to be allocated,
+reservations made on each context and physical memory mapped to those
+reservations.  Because there is a one-to-one context, the memory has
+to be allocated from that context.  It's also best to map the memory in
+a single call using vcm_make_binding():
+
+	struct vcm_res *res_onetoone;
+
+	res_onetoone = vcm_make_binding(vcm_onetoone, SZ_2M + SZ_4K, 0, 0);
+
+What's left is to map the space in the other two contexts.  If the
+reservations in the other two contexts won't be used for any other
+purpose than to reference the memory allocated above, it's best to
+use vcm_map():
+
+	struct vcm_res *res_vmm, *res_iommu;
+
+	res_vmm = vcm_map(vcm_vmm, res_onetoone->phys, 0);
+	res_iommu = vcm_map(vcm_iommu, res_onetoone->phys, 0);
+
+Once the bindings have been created, the contexts need to be activated
+to make sure that they are actually on the hardware. (In case of
+one-to-one mapping it's most likely a no-operation but it's still
+required by the VCMM API so it must not be omitted.)
+
+	vcm_activate(vcm_vmm);
+	vcm_activate(vcm_onetoone);
+	vcm_activate(vcm_iommu);
+
+At this point, all three reservations represent addresses in their
+respective address spaces that are bound to physical memory.  Not only
+can the CPU access it now, but so can devices connected through the
+MMU, as well as devices connected directly to the memory banks.  The bus
+address for the devices and virtual address for the CPU is available
+through the 'start' member of the vcm_res structure (ie. res_* objects
+above).
+
+Once the mapping is no longer used and memory no longer needed it can
+be freed as follows:
+
+	vcm_unmap(res_vmm);
+	vcm_unmap(res_iommu);
+	vcm_destroy_binding(res_onetoone);
+
+If the contexts are not needed either, they can be disabled:
+
+	vcm_deactivate(vcm_vmm);
+	vcm_deactivate(vcm_iommu);
+	vcm_deactivate(vcm_onetoone);
+
+and then even destroyed:
+
+	vcm_destroy(vcm_vmm);
+	vcm_destroy(vcm_iommu);
+	vcm_destroy(vcm_onetoone);
+
+* Available drivers
+
+The following VCM drivers are provided:
+
+** Real hardware drivers
+
+There are no real hardware drivers at this time.
+
+** One-to-One drivers
+
+As it has been noted, One-to-One drivers are limited in the sense that
+certain operations are very unlikely to succeed.  In fact, it is often
+certain that some operations will fail.  If your driver needs to be
+able to run with a One-to-One driver, you should limit operations to:
+
+	vcm_make_binding()
+	vcm_destroy_binding()
+
+Under some conditions, vcm_map() may also work.
+
+There are no One-to-One drivers at this time.
+
+* Writing a VCM driver
+
+The core of VCMM does not handle communication with the MMU.  For this
+purpose a VCM driver is used.  Its purpose is to manage virtual
+address space reservations, physical allocations as well as updating
+mappings in the hardware MMU.
+
+API designed for VCM drivers is described in the
+[[file:../include/linux/vcm-drv.h][include/linux/vcm-drv.h]] file so it might be a good idea to take a look
+inside.
+
+VCMM provides an API for three different kinds of drivers.  The most
+basic is a core VCM driver which the VCMM uses directly.  Other than
+that, VCMM
+provides two wrappers -- VCM MMU and VCM One-to-One -- which can be
+used to create drivers for real hardware VCM contexts and for
+One-to-One contexts.
+
+All of the drivers need to provide a context creation function which
+will allocate memory, fill in the start address, size and pointer to
+driver operations, and then call an init function which fills the rest
+of the
+fields and validates entered values.
+
+** Writing a core VCM driver
+
+The core driver needs to provide a context creation function as well
+as at least some of the following operations:
+
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*alloc)(struct vcm *vcm, resource_size_t size,
+		     struct vcm_phys **phys, unsigned alloc_flags,
+		     struct vcm_res **res, unsigned res_flags);
+	struct vcm_res *(*res)(struct vcm *vcm, resource_size_t size,
+			       unsigned flags);
+	struct vcm_phys *(*phys)(struct vcm *vcm, resource_size_t size,
+				 unsigned flags);
+
+	void (*unreserve)(struct vcm_res *res);
+
+	struct vcm_res *(*map)(struct vcm *vcm, struct vcm_phys *phys,
+			       unsigned flags);
+	int (*bind)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*unbind)(struct vcm_res *res);
+
+	int (*activate)(struct vcm *vcm);
+	void (*deactivate)(struct vcm *vcm);
+
+All of the operations (except for alloc) may assume that all
+pointer arguments are non-NULL.  (In case of alloc, if any argument is
+NULL it is either phys or res (never both).)
+
+*** Context creation
+
+To use a VCM driver a VCM context has to be provided which is bound to
+the driver.  This is done by a driver-dependent call defined in its
+header file.  Such a call may take various arguments to configure the
+context of the MMU.  Its prototype may look as follows:
+
+	struct vcm *__must_check vcm_samp_create(/* ... */);
+
+The driver will most likely define a structure encapsulating the vcm
+structure (in the usual way).  The context creation function must
+allocate space for such a structure and initialise it correctly
+including all members of the vcm structure except for activations.
+The activations member is initialised by calling:
+
+	struct vcm *__must_check vcm_init(struct vcm *vcm);
+
+This function also validates that all fields are set correctly.
+
+The driver field of the vcm structure must point to a structure with
+all operations supported by the driver.
+
+If everything succeeds, the function has to return pointer to the vcm
+structure inside the encapsulating structure.  It is the pointer that
+will be passed to all of the driver's operations.  On error,
+an error-pointer must be returned (ie. not NULL).
+
+The function might look something like the following:
+
+	struct vcm *__must_check vcm_foo_create(/* ... */)
+	{
+		struct vcm_foo *foo;
+		struct vcm *vcm;
+
+		foo = kzalloc(sizeof *foo, GFP_KERNEL);
+		if (!foo)
+			return ERR_PTR(-ENOMEM);
+
+		/* ... do stuff ... */
+
+		foo->vcm.start  = /* ... */;
+		foo->vcm.size   = /* ... */;
+		foo->vcm.driver = &vcm_foo_driver;
+
+		vcm = vcm_init(&foo->vcm);
+		if (IS_ERR(vcm)) {
+			/* ... error recovery ... */
+			kfree(foo);
+		}
+		return vcm;
+	}
+
+*** Cleaning up
+
+The cleanup operation is called when the VCM context is destroyed.
+Its purpose is to free all resources acquired when VCM context was
+created including the space for the context structure.  If it is not
+given, the memory is freed using the kfree() function.
+
+*** Allocation and reservations
+
+If alloc operation is specified, res and phys operations are ignored.
+The observable behaviour of the alloc operation should mimic as
+closely as possible res and phys operations called one after the
+other.
+
+The reason for this operation is that in case of one-to-one VCM
+contexts, the driver may not be able to bind together arbitrary
+reservation with an arbitrary physical space.  In one-to-one contexts,
+reservations and physical memory are tight together and need to be
+made at the same time to make binding possible.
+
+The alloc operation may be called with both res and phys set, or with
+at most one of them being NULL.
+
+The res operation reserves virtual address space in the VCM context.
+The function must set the start and res_size members of the vcm_res
+structure -- all other fields are filled by the VCMM framework.
+
+The phys operation allocates physical space which can later be bound
+to the reservation.  Unless VCM driver needs some special handling of
+physical memory, the vcm_phys_alloc() function can be used:
+
+	struct vcm_phys *__must_check
+	vcm_phys_alloc(resource_size_t size, unsigned flags,
+		       const unsigned char *orders);
+
+The last argument of this function (orders) is an array of orders of
+page sizes that the function should try to allocate.  This array must be
+sorted from highest order to lowest and the last entry must be zero.
+
+For instance, an array { 8, 4, 0 } means that the function should try
+and allocate 1MiB, 64KiB and 4KiB pages (this is assuming PAGE_SIZE is
+4KiB which is true for all supported architectures).  For example, if
+requested size is 2MiB and 68 KiB, the function will try to allocate
+two 1MiB pages, one 64KiB page and one 4KiB page.  This may be useful
+when the mapping is written to the MMU since the largest possible
+pages will be used reducing the number of entries.
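+
+For illustration, a minimal phys callback of a core driver (hypothetical
+foo_* names, untested) could simply forward to vcm_phys_alloc():
+
+	static const unsigned char foo_orders[] = { 8, 4, 0 };
+
+	static struct vcm_phys *foo_phys(struct vcm *vcm,
+					 resource_size_t size,
+					 unsigned flags)
+	{
+		return vcm_phys_alloc(size, flags, foo_orders);
+	}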
+
+If phys or alloc callback chooses to allocate physical memory on its
+own, it must provide a free callback along with the vcm_phys
+structure.  The purpose of the callback is, as one may imagine, to
+free allocated space.
+
+All those operations may assume that size is non-zero and divisible
+by PAGE_SIZE.
+
+*** Binding
+
+The map operation is optional and it joins res and bind operations
+together.  Like the alloc operation, this is provided because in case of
+one-to-one mappings, the VCM driver may be unable to bind together
+physical space with an arbitrary reservation.
+
+Moreover, in case of some VCM drivers, a mapping for given physical
+memory can already be present (ie. in case of using VMM).
+
+A reservation created with the map operation does not have to be usable
+with any other physical space than the one provided when the reservation
+was created.
+
+The bind operation binds a given reservation to a given physical
+memory.  The operation may assume that the reservation given as an
+argument is not bound to any physical memory.
+
+Whichever of the two operations is used, the binding must be reflected
+on the hardware if the VCM context has been activated.  If VCM context
+has not been activated this is not required.
+
+The vcm_map() function uses map operation if one is provided.
+Otherwise, it falls back to alloc or res operation followed by bind
+operation.  If this is also not possible, -EOPNOTSUPP is returned.
+Similarly, vcm_bind() function uses the bind operation unless it is
+not provided in which case -EOPNOTSUPP is returned.
+
+Also, if alloc operation is not provided but map is, the
+vcm_make_binding() function will use phys and map operations.
+
+*** Freeing resources
+
+The unbind callback removes the binding between reservation and
+a physical memory.  If unbind operation is not provided, VCMM assumes
+that it is a no-operation.
+
+The unreserve callback releases a reservation as well as frees the
+allocated space for the vcm_res structure.  It is required and if it
+is not provided vcm_unreserve() will generate a warning.
+
+*** Activation
+
+When VCM context is activated, the activate callback is called.  It is
+called only once even if vcm_activate() is called several times on the
+same context.
+
+When VCM context is deactivated (that is, if for each call to
+vcm_activate(), vcm_deactivate() was called) the deactivate callback
+is called.
+
+When VCM context is activated, all bound reservations must be
+reflected on the hardware MMU (if any).  Also, after activation, all
+calls to vcm_bind(), vcm_map() or vcm_make_binding() must
+automatically reflect new mappings on the hardware MMU.
+
+Neither of the operations is required and if missing, VCMM will
+assume they are a no-operation and no warning will be generated.
+
+** Writing a hardware MMU driver
+
+It may be undesirable to implement all of the operations that are
+required to create a usable driver.  In case of hardware MMUs a helper
+wrapper driver has been created to make writing real drivers as simple
+as possible.
+
+The wrapper implements most of the functionality of the driver, leaving
+only the actual talking to the hardware MMU in the hands of the
+programmer.  Reservation management as well as general housekeeping is
+already there.
+
+If you want to use this wrapper, you need to select VCM_MMU Kconfig
+option.
+
+*** Context creation
+
+Similarly to normal drivers, MMU driver needs to provide a context
+creation function.  Such a function must provide a vcm_mmu object and
+initialise vcm.start, vcm.size and driver fields of the structure.
+When this is done, vcm_mmu_init() should be called which will
+initialise the rest of the fields and validate entered values:
+
+	struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu);
+
+This is, in fact, very similar to the way standard driver is created.
+
+*** Orders
+
+One of the fields of the vcm_mmu_driver structure is orders.  This is
+an array of orders of pages supported by the hardware MMU.  It must be
+sorted from largest to smallest and zero terminated.
+
+The order is the base-two logarithm of the supported page size divided
+by PAGE_SIZE.  For instance, { 8, 4, 0 } means that the
+MMU supports 1MiB, 64KiB and 4KiB pages.
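+
+For example, a driver for such an MMU might define (hypothetical name):
+
+	static const unsigned char foo_mmu_orders[] = { 8, 4, 0 };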
+
+*** Operations
+
+The three operations that MMU wrapper driver uses are:
+
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*activate)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*deactivate)(struct vcm_res *res, struct vcm_phys *phys);
+
+	int (*activate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			     unsigned order, void *vcm),
+	int (*deactivate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			       unsigned order, void *vcm),
+
+The first one frees all resources allocated by the context creation
+function (including the structure itself).  If this operation is not
+given, kfree() will be called on vcm_mmu structure.
+
+The activate and deactivate operations are required and they are used
+to update mappings in the MMU.  Whenever binding is activated or
+deactivated the respective operation is called.
+
+To divide mapping into physical pages, vcm_phys_walk() function can be
+used:
+
+	int vcm_phys_walk(dma_addr_t vaddr, const struct vcm_phys *phys,
+			  const unsigned char *orders,
+			  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+					  unsigned order, void *priv),
+			  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+					  unsigned order, void *priv),
+			  void *priv);
+
+It starts from the given virtual address and tries to divide the
+allocated physical memory into as few pages as possible, where the
+order of each page is one of the orders specified by the orders
+argument.
+
+It may be easier to implement activate_page and deactivate_page
+operations instead, though.  They are called on each individual page
+rather than on the whole mapping.  The wrapper then calls
+vcm_phys_walk() itself so the driver does not need to do it
+explicitly.  A sketch of such a driver is shown below.
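+
+The following hypothetical, untested sketch (foo_* names are made up,
+foo_mmu_orders is the array from the Orders section above) shows the
+rough shape of such a driver:
+
+	static int foo_activate_page(dma_addr_t vaddr, dma_addr_t paddr,
+				     unsigned order, void *vcm)
+	{
+		/* ... program a single page table entry ... */
+		return 0;
+	}
+
+	static int foo_deactivate_page(dma_addr_t vaddr, dma_addr_t paddr,
+				       unsigned order, void *vcm)
+	{
+		/* ... clear a single page table entry ... */
+		return 0;
+	}
+
+	static const struct vcm_mmu_driver foo_mmu_driver = {
+		.orders			= foo_mmu_orders,
+		.activate_page		= foo_activate_page,
+		.deactivate_page	= foo_deactivate_page,
+	};
+
+	struct vcm *__must_check vcm_foo_mmu_create(/* ... */)
+	{
+		struct vcm_mmu *mmu;
+		struct vcm *vcm;
+
+		mmu = kzalloc(sizeof *mmu, GFP_KERNEL);
+		if (!mmu)
+			return ERR_PTR(-ENOMEM);
+
+		mmu->vcm.start = /* ... */;
+		mmu->vcm.size  = /* ... */;
+		mmu->driver    = &foo_mmu_driver;
+
+		vcm = vcm_mmu_init(mmu);
+		if (IS_ERR(vcm))
+			kfree(mmu);
+		return vcm;
+	}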
+
+** Writing a one-to-one VCM driver
+
+Similarly to the wrapper for a real hardware MMU, a wrapper for
+one-to-one VCM contexts has been created.  It implements all of the
+housekeeping operations and leaves only contiguous memory management
+(that is, allocating and freeing contiguous regions) to the driver.
+
+*** Context creation
+
+As with other drivers, a one-to-one driver needs to provide a context
+creation function.  It needs to allocate space for a vcm_o2o structure
+and initialise its vcm.start, vcm.size and driver fields.  Calling
+vcm_o2o_init() will fill the other fields and validate entered values:
+
+	struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o);
+
+There are the following three operations used by the wrapper:
+
+	void (*cleanup)(struct vcm *vcm);
+	void *(*alloc)(struct vcm *vcm, struct vcm_phys_part *part,
+		       unsigned flags);
+	void (*free)(struct vcm_phys_part *part, void *priv);
+
+The cleanup operation cleans the context and frees all resources.  If
+not provided, kfree() is used.
+
+The alloc operation allocates physically contiguous memory.  The size
+of the requested block is saved in the provided vcm_phys_part structure.
+Inside this structure, the operation must save the physical address of
+the allocated block if allocation succeeds.  On error, the operation
+must return an error-pointer (NULL is treated as success).
+
+The free operation frees the physically contiguous memory that has
+been allocated with alloc.  As arguments, it is given the same
+vcm_phys_part structure that alloc initialised as well as a priv
+argument which is the value returned by the alloc operation.
+
+Both the alloc and free operations are required.  A minimal driver
+could look like the sketch below.
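+
+For illustration, a hypothetical, untested one-to-one driver backing
+reservations with alloc_pages_exact() (foo_* names are made up) could
+look like this:
+
+	static void *foo_o2o_alloc(struct vcm *vcm,
+				   struct vcm_phys_part *part,
+				   unsigned flags)
+	{
+		void *virt = alloc_pages_exact(part->size, GFP_KERNEL);
+		if (!virt)
+			return ERR_PTR(-ENOMEM);
+		part->start = virt_to_phys(virt);
+		return virt;	/* passed back to foo_o2o_free() as priv */
+	}
+
+	static void foo_o2o_free(struct vcm_phys_part *part, void *priv)
+	{
+		free_pages_exact(priv, part->size);
+	}
+
+	static const struct vcm_o2o_driver foo_o2o_driver = {
+		.alloc	= foo_o2o_alloc,
+		.free	= foo_o2o_free,
+	};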
+
+If you want to use this wrapper, you need to select VCM_O2O Kconfig
+option.
+
+* Epilogue
+
+The initial version of the VCMM framework was written by Zach Pfeffer
+<zpfeffer@codeaurora.org>.  It was then redesigned and mostly
+rewritten by Michal Nazarewicz <m.nazarewicz@samsung.com>.
+
+The new version is still lacking a few important features.  Most
+notably, no real hardware MMU driver has been implemented yet.  This
+may be ported from Zach's original proposal.
+
+Also, support for VMM is lacking.  This is another thing that can be
+ported from Zach's proposal.
diff --git a/include/linux/vcm-drv.h b/include/linux/vcm-drv.h
new file mode 100644
index 0000000..8c1862e
--- /dev/null
+++ b/include/linux/vcm-drv.h
@@ -0,0 +1,299 @@
+/*
+ * Virtual Contiguous Memory driver API header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_DRV_H
+#define __LINUX_VCM_DRV_H
+
+#include <linux/vcm.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+
+#include <asm/atomic.h>
+
+/**
+ * struct vcm_driver - the MMU driver operations.
+ * @cleanup:	called when vcm objects is destroyed; if omitted,
+ *		kfree() will be used.
+ * @alloc:	callback function for allocating physical memory and
+ *		reserving virtual address space; XXX FIXME: document;
+ *		if set, @res and @phys are ignored.
+ * @res:	creates a reservation of virtual address space; XXX FIXME:
+ *		document; if @alloc is provided this is ignored.
+ * @phys:	allocates physical memory; XXX FIXME: document; if @alloc
+ *		is provided this is ignored.
+ * @unreserve:	destroys a virtual address space reservation created by @res or @alloc;
+ *		required.
+ * @map:	reserves address space and binds a physical memory to it.
+ * @bind:	binds a physical memory to a reserved address space.
+ * @unbind:	unbinds a physical memory from reserved address space.
+ * @activate:	activates the context making all bindings active; once
+ *		the context has been activated, this callback is not
+ *		called again until context is deactivated and
+ *		activated again (so if user calls vcm_activate()
+ *		several times only the first call in sequence will
+ *		invoke this callback).
+ * @deactivate:	deactivates the context making all bindings inactive;
+ *		a call to this callback always accompanies a call to the
+ *		@activate callback.
+ */
+struct vcm_driver {
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*alloc)(struct vcm *vcm, resource_size_t size,
+		     struct vcm_phys **phys, unsigned alloc_flags,
+		     struct vcm_res **res, unsigned res_flags);
+	struct vcm_res *(*res)(struct vcm *vcm, resource_size_t size,
+			       unsigned flags);
+	struct vcm_phys *(*phys)(struct vcm *vcm, resource_size_t size,
+				 unsigned flags);
+
+	void (*unreserve)(struct vcm_res *res);
+
+	struct vcm_res *(*map)(struct vcm *vcm, struct vcm_phys *phys,
+			       unsigned flags);
+	int (*bind)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*unbind)(struct vcm_res *res);
+
+	int (*activate)(struct vcm *vcm);
+	void (*deactivate)(struct vcm *vcm);
+};
+
+/**
+ * struct vcm_phys - representation of allocated physical memory.
+ * @count:	number of contiguous parts the memory consists of; if this
+ *		equals one the whole memory block is physically contiguous;
+ *		read only.
+ * @size:	total size of the allocated memory; read only.
+ * @free:	callback function called when memory is freed; internal.
+ * @bindings:	how many virtual address space reservations this memory has
+ *		been bound to; internal.
+ * @parts:	array of @count parts describing each physically contiguous
+ *		memory block that the whole area consists of; each element
+ *		describes part's physical starting address in bytes
+ *		(@parts->start) and its size in bytes (@parts->size); read
+ *		only.
+ */
+struct vcm_phys {
+	unsigned		count;
+	resource_size_t		size;
+
+	void (*free)(struct vcm_phys *phys);
+	atomic_t		bindings;
+
+	struct vcm_phys_part {
+		dma_addr_t	start;
+		resource_size_t	size;
+	} parts[0];
+};
+
+/**
+ * vcm_init() - initialises VCM context structure.
+ * @vcm:	the VCM context to initialise.
+ *
+ * This function initialises the vcm structure created by a MMU driver
+ * when setting things up.  It sets up all fields of the vcm structure
+ * except for @vcm->start, @vcm->size and @vcm->driver which are
+ * validated by this function.  If they have an invalid value, the
+ * function produces a warning and returns an error-pointer.  If
+ * everything is
+ * fine, @vcm is returned.
+ */
+struct vcm *__must_check vcm_init(struct vcm *vcm);
+
+#ifdef CONFIG_VCM_MMU
+
+struct vcm_mmu;
+
+/**
+ * struct vcm_mmu_driver - a driver used for real MMUs.
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ * @cleanup:	Function called when the VCM context is destroyed;
+ *		optional, if not provided, kfree() is used.
+ * @activate:	callback function for activating a single mapping; its
+ *		role is to set up the MMU so that the reserved address space
+ *		denoted by res will point to the physical memory denoted by
+ *		phys; required unless @activate_page and @deactivate_page
+ *		are both provided
+ * @deactivate:	this reverses the effect of @activate; required unless
+ *		@deactivate_page is provided.
+ * @activate_page:	callback function for activating a single page; it is
+ *			ignored if @activate is provided; it's given a single
+ *			page such that its order (given as third argument) is
+ *			one of the supported orders specified in @orders;
+ *			required unless @activate is provided.
+ * @deactivate_page:	this reverses the effect of the @activate_page
+ *			callback; required unless @activate and @deactivate
+ *			are both provided.
+ */
+struct vcm_mmu_driver {
+	const unsigned char	*orders;
+
+	void (*cleanup)(struct vcm *vcm);
+	int (*activate)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*deactivate)(struct vcm_res *res, struct vcm_phys *phys);
+	int (*activate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			     unsigned order, void *vcm);
+	int (*deactivate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			       unsigned order, void *vcm);
+};
+
+/**
+ * struct vcm_mmu - VCM MMU context
+ * @vcm:	VCM context.
+ * @driver:	VCM MMU driver's operations.
+ * @pool:	virtual address space allocator; internal.
+ * @bound_res:	list of bound reservations; internal.
+ * @mutex:	mutex protecting @bound_res; internal.
+ * @activated:	whether VCM context has been activated; internal.
+ */
+struct vcm_mmu {
+	struct vcm			vcm;
+	const struct vcm_mmu_driver	*driver;
+	/* internal */
+	struct gen_pool			*pool;
+	struct list_head		bound_res;
+	/* The mutex protects operations on the bound_res list. */
+	struct mutex			mutex;
+	int				activated;
+};
+
+/**
+ * vcm_mmu_init() - initialises a VCM context for a real MMU.
+ * @mmu:	the vcm_mmu context to initialise.
+ *
+ * This function initialises the vcm_mmu structure created by a MMU
+ * driver when setting things up.  It sets up all fields of the
+ * structure except for @mmu->vcm.start, @mmu->vcm.size and
+ * @mmu->driver which are validated by this function.  If they have an
+ * invalid value, the function produces a warning and returns an
+ * error-pointer.  On any other error, an error-pointer is returned as
+ * well.  If everything is fine, address of @mmu->vcm is returned.
+ */
+struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu);
+
+#endif
+
+#ifdef CONFIG_VCM_O2O
+
+/**
+ * struct vcm_o2o_driver - VCM One-to-One driver
+ * @cleanup:	cleans up the VCM context; if not specified, kfree() is used.
+ * @alloc:	physically contiguous memory allocator; the size of the
+ *		block to allocate is specified by part->size; the physical
+ *		address of the block must be returned in part->start;
+ *		on error must return an error-pointer, otherwise some
+ *		other pointer which will be passed to @free as priv;
+ *		required.
+ * @free:	physical memory freeing function; required.
+ */
+struct vcm_o2o_driver {
+	void (*cleanup)(struct vcm *vcm);
+	void *(*alloc)(struct vcm *vcm, struct vcm_phys_part *part,
+		       unsigned flags);
+	void (*free)(struct vcm_phys_part *part, void *priv);
+};
+
+/**
+ * struct vcm_o2o - VCM One-to-One context
+ * @vcm:	VCM context.
+ * @driver:	VCM One-to-One driver's operations.
+ */
+struct vcm_o2o {
+	struct vcm			vcm;
+	const struct vcm_o2o_driver	*driver;
+};
+
+/**
+ * vcm_o2o_init() - initialises a VCM context for a one-to-one context.
+ * @o2o:	the vcm_o2o context to initialise.
+ *
+ * This function initialises the vcm_o2o structure created by a O2O
+ * driver when setting things up.  It sets up all fields of the
+ * structure except for @o2o->vcm.start, @o2o->vcm.size and
+ * @o2o->driver which are validated by this function.  If they have an
+ * invalid value, the function produces a warning and returns an
+ * error-pointer.  On any other error, an error-pointer is returned as
+ * well.  If everything is fine, address of @o2o->vcm is returned.
+ */
+struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o);
+
+#endif
+
+#ifdef CONFIG_VCM_PHYS
+
+/**
+ * vcm_phys_alloc() - allocates physical discontiguous space
+ * @size:	size of the block to allocate.
+ * @flags:	additional allocation flags; XXX FIXME: document
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ *
+ * This function tries to allocate physically discontiguous space in
+ * such a way that it allocates the largest possible blocks from the
+ * sizes denoted by the @orders array.  So if @orders is { 8, 0 }
+ * (which means 1MiB and 4KiB pages are to be used) and the requested
+ * @size is 2MiB and 12KiB, the function will try to allocate two 1MiB
+ * pages and three 4KiB pages (in that order).  If a big page cannot
+ * be allocated, the function will still try to allocate smaller
+ * pages.
+ */
+struct vcm_phys *__must_check
+vcm_phys_alloc(resource_size_t size, unsigned flags,
+	       const unsigned char *orders);
+
+/**
+ * vcm_phys_walk() - helper function for mapping physical pages
+ * @vaddr:	virtual address to map/unmap physical space to/from
+ * @phys:	physical space
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ * @callback:	function called for each page.
+ * @recovery:	function called for each page when @callback returns
+ *		a negative number; if it also returns a negative number
+ *		the walk terminates; may be NULL.
+ * @priv:	private data for the callbacks.
+ *
+ * This function walks through @phys trying to match the largest
+ * possible page size denoted by @orders.  For each such page
+ * @callback is called.  If @callback returns a negative number, the
+ * function calls @recovery for each page for which @callback
+ * completed successfully.
+ *
+ * So, for instance, if we have physical memory which consists of
+ * a 1MiB part and an 8KiB part and @orders is { 8, 0 } (which means
+ * 1MiB and 4KiB pages are to be used), @callback will be called first
+ * with the 1MiB page and then twice with 4KiB pages.  This is of
+ * course provided that @vaddr has the correct alignment.
+ *
+ * The idea is for hardware MMU drivers to call this function and
+ * provide a callbacks for mapping/unmapping a single page.  The
+ * function divides the region into pages that the MMU can handle.
+ *
+ * If @callback at one point returns a negative number this is the
+ * return value of the function; otherwise zero is returned.
+ */
+int vcm_phys_walk(dma_addr_t vaddr, const struct vcm_phys *phys,
+		  const unsigned char *orders,
+		  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv),
+		  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv),
+		  void *priv);
+
+#endif
+
+#endif
diff --git a/include/linux/vcm.h b/include/linux/vcm.h
new file mode 100644
index 0000000..965dc9b
--- /dev/null
+++ b/include/linux/vcm.h
@@ -0,0 +1,275 @@
+/*
+ * Virtual Contiguous Memory header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_H
+#define __LINUX_VCM_H
+
+#include <linux/kref.h>
+#include <linux/compiler.h>
+
+struct vcm_driver;
+struct vcm_phys;
+
+/**
+ * struct vcm - A virtually contiguous memory context.
+ * @start:	the smallest possible address available in this context.
+ * @size:	size of available address space in bytes; internal, read
+ *		only for MMU drivers.
+ * @activations:	how many times the context was activated; internal,
+ *			read only for MMU drivers.
+ * @driver:	driver handling this context; internal.
+ *
+ * This structure represents a context of virtually contiguous memory
+ * managed by an MMU handled by the @driver.  This is the main
+ * structure used to interact with the VCM framework.
+ *
+ * Whenever a driver wants to reserve virtual address space or
+ * allocate backing storage, a pointer to this structure must be
+ * passed.
+ */
+struct vcm {
+	dma_addr_t		start;
+	resource_size_t		size;
+	atomic_t		activations;
+	const struct vcm_driver	*driver;
+};
+
+/**
+ * struct vcm_res - A reserved virtually contiguous address space.
+ * @start:	bus address of the region in bytes; read only.
+ * @bound_size:	number of bytes actually bound to the virtual address;
+ *		read only.
+ * @res_size:	size of the reserved address space in bytes; read only.
+ * @vcm:	VCM context; internal, read only for MMU drivers.
+ * @phys:	pointer to physical memory bound to this reservation; NULL
+ *		if no physical memory is bound; read only.
+ *
+ * This structure represents a portion of virtually contiguous address
+ * space reserved for use by the driver.  Once address space is
+ * reserved, physical memory can be bound to it so that it will point
+ * to real memory.
+ */
+struct vcm_res {
+	dma_addr_t		start;
+	resource_size_t		bound_size;
+	resource_size_t		res_size;
+
+	struct vcm		*vcm;
+	struct vcm_phys		*phys;
+};
+
+
+/**
+ * vcm_destroy() - destroys a VCM context.
+ * @vcm:	VCM to destroy.
+ */
+void vcm_destroy(struct vcm *vcm);
+
+/**
+ * vcm_make_binding() - allocates memory and binds it to virtual address space
+ * @vcm:	VCM context to reserve virtual address space in
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @alloc_flags:	additional allocator flags; see vcm_alloc() for
+ *			description of those.
+ * @res_flags:	additional reservation flags; see vcm_reserve() for
+ *		description of those.
+ *
+ * This is a call that binds together three other calls:
+ * vcm_reserve(), vcm_alloc() and vcm_bind().  The purpose of this
+ * function is that on systems with no IO MMU separate calls to
+ * vcm_alloc() and vcm_reserve() may fail whereas when called together
+ * they may work correctly.
+ *
+ * This is a consequence of the fact that with no IO MMU the simulated
+ * virtual address must be the same as the physical address, thus if
+ * the virtual address space were reserved first and physical memory
+ * allocated afterwards, the two addresses might not match.
+ *
+ * With this call, a driver that simulates an IO MMU may simply
+ * allocate physical memory and, when this succeeds, create a matching
+ * reservation.
+ *
+ * In short, if device drivers do not need more advanced MMU
+ * functionalities, they should limit themselves to this function
+ * since then the drivers may be easily ported to systems without an
+ * IO MMU.
+ *
+ * To access the vcm_phys structure created by this call, the phys
+ * field of the returned vcm_res structure should be used.
+ *
+ * On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_res  *__must_check
+vcm_make_binding(struct vcm *vcm, resource_size_t size,
+		 unsigned alloc_flags, unsigned res_flags);
+
+/**
+ * vcm_map() - makes a reservation and binds physical memory to it
+ * @vcm:	VCM context
+ * @phys:	physical memory to bind.
+ * @flags:	additional flags; see vcm_reserve() for	description of
+ *		those.
+ *
+ * This is a call that binds together two other calls: vcm_reserve()
+ * and vcm_bind().  If all you need is to reserve address space and
+ * bind physical memory, it's better to use this call since it may
+ * create better mappings in some situations.
+ *
+ * Drivers may be optimised in such a way that it won't be possible to
+ * use the reservation with a different physical memory.
+ *
+ * On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_res *__must_check
+vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags);
+
+/**
+ * vcm_alloc() - allocates a physical memory for use with vcm_res.
+ * @vcm:	VCM context allocation is performed in.
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @flags:	additional allocator flags; XXX FIXME: describe
+ *
+ * In case of some MMU drivers, the @vcm may be important and later
+ * binding (vcm_bind()) may fail if done on another @vcm.
+ *
+ * On success returns a vcm_phys structure representing an allocated
+ * physical memory that can be bound to reserved virtual address
+ * space.  On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_phys *__must_check
+vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+/**
+ * vcm_free() - frees an allocated physical memory
+ * @phys:	physical memory to free.
+ *
+ * If the physical memory is bound to any reserved address space it
+ * must be unbound first.  Otherwise a warning will be issued and
+ * the memory won't be freed, causing a memory leak.
+ */
+void vcm_free(struct vcm_phys *phys);
+
+/**
+ * vcm_reserve() - reserves a portion of virtual address space.
+ * @vcm:	VCM context reservation is performed in.
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @flags:	additional reservation flags; XXX FIXME: describe
+ * @alignment:	required alignment of the reserved space; must be
+ *		a power of two or zero.
+ *
+ * On success returns a vcm_res structure representing a reserved
+ * (contiguous) virtual address space that physical memory can be
+ * bound to (using vcm_bind()).  On error returns a pointer which
+ * yields true when tested with IS_ERR().
+ */
+struct vcm_res *__must_check
+vcm_reserve(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+/**
+ * vcm_unreserve() - destroys a virtual address space reservation
+ * @res:	reservation to destroy.
+ *
+ * If any physical memory is bound to the reserved address space it
+ * must be unbound first.  Otherwise it will be unbound automatically
+ * and a warning will be issued.
+ */
+void vcm_unreserve(struct vcm_res *res);
+
+/**
+ * vcm_bind() - binds a physical memory to virtual address space
+ * @res:	virtual address space to bind the physical memory.
+ * @phys:	physical memory to bind to the virtual addresses.
+ *
+ * The mapping won't be active unless vcm_activate() has been called
+ * on the VCM context that @res was created in.
+ *
+ * If @phys is already bound to @res this function returns -EALREADY.
+ * If some other physical memory is bound to @res -EADDRINUSE is
+ * returned.  If the size of the physical memory is larger than the
+ * virtual space -ENOSPC is returned.  In all other cases the physical
+ * memory is bound to the virtual address and on success zero is
+ * returned, on error a negative number.
+ */
+int  __must_check vcm_bind(struct vcm_res *res, struct vcm_phys *phys);
+
+/**
+ * vcm_unbind() - unbinds a physical memory from virtual address space
+ * @res:	virtual address space to unbind the physical memory from.
+ *
+ * This reverses the effect of the vcm_bind() function.  Function
+ * returns physical space that was bound to the reservation (or NULL
+ * if no space was bound in which case also a warning is issued).
+ */
+struct vcm_phys *vcm_unbind(struct vcm_res *res);
+
+/**
+ * vcm_destroy_binding() - destroys the binding
+ * @res:	a bound reserved address space to destroy.
+ *
+ * This function incorporates three functions: vcm_unbind(),
+ * vcm_free() and vcm_unreserve() (in that order) in one call.
+ */
+void vcm_destroy_binding(struct vcm_res *res);
+
+/**
+ * vcm_unmap() - unbinds physical memory and unreserves address space
+ * @res:	reservation to destroy
+ *
+ * This is a call that binds together two other calls: vcm_unbind()
+ * and vcm_unreserve().
+ */
+static inline void vcm_unmap(struct vcm_res *res)
+{
+	vcm_unbind(res);
+	vcm_unreserve(res);
+}
+
+/**
+ * vcm_activate() - activates bindings in VCM.
+ * @vcm:	VCM to activate bindings in.
+ *
+ * All of the bindings on the @vcm done before this function is called
+ * are inactive and do not take effect.  The call to this function
+ * guarantees that all bindings are sent to the hardware MMU (if any).
+ *
+ * After VCM is activated all bindings will be automatically updated
+ * on the hardware MMU, so there is no need to call this function
+ * after each vcm_bind()/vcm_unbind().
+ *
+ * Each call to vcm_activate() should be later accompanied by a call
+ * to vcm_deactivate().  Otherwise a warning will be issued when VCM
+ * context is destroyed (vcm_destroy()).  This function can be called
+ * several times.
+ *
+ * On success returns zero, on error a negative error code.
+ */
+int  __must_check vcm_activate(struct vcm *vcm);
+
+/**
+ * vcm_deactivate() - deactivates bindings in VCM.
+ * @vcm:	VCM to deactivate bindings in.
+ *
+ * This function reverts the effect of the vcm_activate() function.
+ * After calling this function the caller has no guarantee that the
+ * bindings defined in the VCM are active.
+ *
+ * If this is called without a prior call to vcm_activate(), a warning
+ * is issued.
+ */
+void vcm_deactivate(struct vcm *vcm);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index b410910..0445f68 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -371,3 +371,33 @@ config CMA_BEST_FIT
 	  the number of allocated regions and usually much smaller).  It
 	  allocates area from the smallest hole that is big enough for
 	  allocation in question.
+
+
+config VCM
+	bool "Virtual Contiguous Memory framework"
+	help
+	  This enables the Virtual Contiguous Memory framework which
+	  provides an abstraction for virtual address space provided by
+	  various MMUs present on the platform.
+
+	  The framework uses pluggable MMU drivers for hardware MMUs and,
+	  if drivers obey some limitations, it can also be used on
+	  platforms with no MMU.
+
+	  For more information see
+	  <Documentation/virtual-contiguous-memory.txt>.  If unsure, say
+	  "n".
+
+# Select it if you need vcm_mmu wrapper driver
+config VCM_MMU
+	select VCM_PHYS
+	select GENERIC_ALLOCATOR
+	bool
+
+# Select if you need vcm_o2o wrapper driver
+config VCM_O2O
+	bool
+
+# Select if you need vcm_phys_alloc() or vcm_phys_walk() functions
+config VCM_PHYS
+	bool
diff --git a/mm/Makefile b/mm/Makefile
index d8c717f..e908202 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -49,3 +49,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
 obj-$(CONFIG_CMA) += cma.o
 obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
+obj-$(CONFIG_VCM) += vcm.o
diff --git a/mm/vcm.c b/mm/vcm.c
new file mode 100644
index 0000000..ef3d1a6
--- /dev/null
+++ b/mm/vcm.c
@@ -0,0 +1,932 @@
+/*
+ * Virtual Contiguous Memory core
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#include <linux/vcm-drv.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/genalloc.h>
+
+#include <asm/atomic.h>
+
+/******************************** Devices API *******************************/
+
+void vcm_destroy(struct vcm *vcm)
+{
+	if (WARN_ON(atomic_read(&vcm->activations)))
+		vcm->driver->deactivate(vcm);
+
+	if (vcm->driver->cleanup)
+		vcm->driver->cleanup(vcm);
+	else
+		kfree(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_destroy);
+
+static void
+__vcm_alloc_and_reserve(struct vcm *vcm, resource_size_t size,
+			struct vcm_phys **phys, unsigned alloc_flags,
+			struct vcm_res **res, unsigned res_flags)
+{
+	int ret, alloc = 0;
+
+	if (WARN_ON(!vcm) || !size) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	size = PAGE_ALIGN(size);
+
+	if (vcm->driver->alloc) {
+		ret = vcm->driver->alloc(vcm, size,
+					 phys, alloc_flags, res, res_flags);
+		if (ret)
+			goto error;
+		alloc = 1;
+	} else if ((res && !vcm->driver->res) || (phys && !vcm->driver->phys)) {
+		ret = -EOPNOTSUPP;
+		goto error;
+	}
+
+	if (res) {
+		if (!alloc) {
+			*res = vcm->driver->res(vcm, size, res_flags);
+			if (IS_ERR(*res)) {
+				ret = PTR_ERR(*res);
+				goto error;
+			}
+		}
+		(*res)->bound_size = 0;
+		(*res)->vcm = vcm;
+		(*res)->phys = NULL;
+	}
+
+	if (phys) {
+		if (!alloc) {
+			*phys = vcm->driver->phys(vcm, size, alloc_flags);
+			if (IS_ERR(*phys)) {
+				vcm_unreserve(*res);
+				ret = PTR_ERR(*phys);
+				goto error;
+			}
+		}
+		atomic_set(&(*phys)->bindings, 0);
+		WARN_ON(!(*phys)->free);
+	}
+
+	return;
+
+error:
+	if (phys)
+		*phys = ERR_PTR(ret);
+	if (res)
+		*res = ERR_PTR(ret);
+}
+
+struct vcm_res *__must_check
+vcm_make_binding(struct vcm *vcm, resource_size_t size,
+		 unsigned alloc_flags, unsigned res_flags)
+{
+	struct vcm_phys *phys;
+	struct vcm_res *res;
+
+	if (WARN_ON(!vcm || !size || (size & (PAGE_SIZE - 1))))
+		return ERR_PTR(-EINVAL);
+	else if (vcm->driver->alloc || !vcm->driver->map) {
+		int ret;
+
+		__vcm_alloc_and_reserve(vcm, size, &phys, alloc_flags,
+					&res, res_flags);
+
+		if (IS_ERR(res))
+			return res;
+
+		ret = vcm_bind(res, phys);
+		if (!ret)
+			return res;
+
+		if (vcm->driver->unreserve)
+			vcm->driver->unreserve(res);
+		phys->free(phys);
+		return ERR_PTR(ret);
+	} else {
+		__vcm_alloc_and_reserve(vcm, size, &phys, alloc_flags,
+					NULL, 0);
+
+		if (IS_ERR(phys))
+			return ERR_CAST(phys);
+
+		res = vcm->driver->map(vcm, phys, res_flags);
+		if (IS_ERR(res))
+			phys->free(phys);
+		return res;
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_make_binding);
+
+struct vcm_phys *__must_check
+vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_phys *phys;
+
+	__vcm_alloc_and_reserve(vcm, size, &phys, flags, NULL, 0);
+
+	return phys;
+}
+EXPORT_SYMBOL_GPL(vcm_alloc);
+
+struct vcm_res *__must_check
+vcm_reserve(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_res *res;
+
+	__vcm_alloc_and_reserve(vcm, size, NULL, 0, &res, flags);
+
+	return res;
+}
+EXPORT_SYMBOL_GPL(vcm_reserve);
+
+struct vcm_res *__must_check
+vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags)
+{
+	struct vcm_res *res;
+	int ret;
+
+	if (WARN_ON(!vcm))
+		return ERR_PTR(-EINVAL);
+
+	if (vcm->driver->map)
+		return vcm->driver->map(vcm, phys, flags);
+
+	res = vcm_reserve(vcm, phys->size, flags);
+	if (IS_ERR(res))
+		return res;
+
+	ret = vcm_bind(res, phys);
+	if (!ret)
+		return res;
+
+	vcm_unreserve(res);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(vcm_map);
+
+void vcm_unreserve(struct vcm_res *res)
+{
+	if (!WARN_ON(!res)) {
+		if (WARN_ON(res->phys))
+			vcm_unbind(res);
+		if (!WARN_ON_ONCE(!res->vcm->driver->unreserve))
+			res->vcm->driver->unreserve(res);
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_unreserve);
+
+void vcm_free(struct vcm_phys *phys)
+{
+	if (!WARN_ON(!phys || atomic_read(&phys->bindings)) && phys->free)
+		phys->free(phys);
+}
+EXPORT_SYMBOL_GPL(vcm_free);
+
+int  __must_check vcm_bind(struct vcm_res *res, struct vcm_phys *phys)
+{
+	int ret;
+
+	if (WARN_ON(!res || !phys))
+		return -EINVAL;
+
+	if (res->phys == phys)
+		return -EALREADY;
+
+	if (res->phys)
+		return -EADDRINUSE;
+
+	if (phys->size > res->res_size)
+		return -ENOSPC;
+
+	if (!res->vcm->driver->bind)
+		return -EOPNOTSUPP;
+
+	atomic_inc(&phys->bindings);
+	ret = res->vcm->driver->bind(res, phys);
+	if (!ret) {
+		res->phys = phys;
+		res->bound_size = phys->size;
+	} else {
+		atomic_dec(&phys->bindings);
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vcm_bind);
+
+struct vcm_phys *vcm_unbind(struct vcm_res *res)
+{
+	struct vcm_phys *phys = NULL;
+	if (!WARN_ON(!res || !res->phys)) {
+		phys = res->phys;
+		if (res->vcm->driver->unbind)
+			res->vcm->driver->unbind(res);
+		atomic_dec(&phys->bindings);
+		res->phys = NULL;
+		res->bound_size = 0;
+	}
+	return phys;
+}
+EXPORT_SYMBOL_GPL(vcm_unbind);
+
+void vcm_destroy_binding(struct vcm_res *res)
+{
+	if (!WARN_ON(!res)) {
+		struct vcm_phys *phys = vcm_unbind(res);
+		if (phys)
+			vcm_free(phys);
+		vcm_unreserve(res);
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_destroy_binding);
+
+int  __must_check vcm_activate(struct vcm *vcm)
+{
+	if (WARN_ON(!vcm))
+		return -EINVAL;
+	else if (atomic_inc_return(&vcm->activations) != 1
+	      || !vcm->driver->activate)
+		return 0;
+	else
+		return vcm->driver->activate(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_activate);
+
+void vcm_deactivate(struct vcm *vcm)
+{
+	if (!WARN_ON(!vcm || !atomic_read(&vcm->activations))
+	 && atomic_dec_and_test(&vcm->activations)
+	 && vcm->driver->deactivate)
+		vcm->driver->deactivate(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_deactivate);
+
+
+/****************************** VCM Drivers API *****************************/
+
+struct vcm *__must_check vcm_init(struct vcm *vcm)
+{
+	if (WARN_ON(!vcm || !vcm->size
+		 || ((vcm->start | vcm->size) & ~PAGE_MASK)
+		 || !vcm->driver || !vcm->driver->unreserve))
+		return ERR_PTR(-EINVAL);
+
+	atomic_set(&vcm->activations, 0);
+
+	return vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_init);
+
+
+/*************************** Hardware MMU wrapper ***************************/
+
+#ifdef CONFIG_VCM_MMU
+
+struct vcm_mmu_res {
+	struct vcm_res			res;
+	struct list_head		bound;
+};
+
+static void vcm_mmu_cleanup(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	WARN_ON(mutex_is_locked(&mmu->mutex));
+	gen_pool_destroy(mmu->pool);
+	if (mmu->driver->cleanup)
+		mmu->driver->cleanup(vcm);
+	else
+		kfree(mmu);
+}
+
+static struct vcm_res *
+vcm_mmu_res(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	resource_size_t s, alignment;
+	struct vcm_mmu_res *res;
+	const unsigned char *orders;
+	dma_addr_t addr;
+
+	res = kzalloc(sizeof *res, GFP_KERNEL);
+	if (!res)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Use the largest alignment that makes sense for given
+	 * reservation size.  For instance, if MMU supports 1M pages
+	 * and reservation is 1M it would be nice to be able to have
+	 * reservation aligned to 1M so that if the physical memory
+	 * will consist of a single 1M block (aligned to 1M) a single
+	 * map entry will suffice.
+	 */
+	s = size >> PAGE_SHIFT;
+	for (orders = mmu->driver->orders; !(s >> *orders); ++orders)
+		/* nop */;
+	alignment = (resource_size_t)1 << (*orders + PAGE_SHIFT);
+
+	/*
+	 * We are allocating a bit more so that if the allocation is not
+	 * aligned we can shift inside the allocated block to get the
+	 * alignment we want.
+	 */
+	s = size + alignment - PAGE_SIZE;
+
+	mutex_lock(&mmu->mutex);
+
+	addr = gen_pool_alloc(mmu->pool, s);
+
+	if (!addr) {
+		kfree(res);
+		res = ERR_PTR(-ENOSPC);
+	} else if (alignment > PAGE_SIZE) {
+		/*
+		 * Align the reservation.  We can safely do this since
+		 * we have allocated more memory than we needed and we
+		 * can move the reservation around.
+		 */
+		dma_addr_t start = ALIGN(addr, alignment);
+
+		/* Free unused memory. */
+		if (start != addr)
+			gen_pool_free(mmu->pool, addr, start - addr);
+		if (start + size != addr + s)
+			gen_pool_free(mmu->pool, start + size,
+				      addr + s - (start + size));
+		addr = start;
+	}
+
+	mutex_unlock(&mmu->mutex);
+
+	if (!IS_ERR(res)) {
+		INIT_LIST_HEAD(&res->bound);
+		res->res.start = addr;
+		res->res.res_size = size;
+	}
+
+	return &res->res;
+}
+
+static struct vcm_phys *
+vcm_mmu_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	return vcm_phys_alloc(size, flags,
+			      container_of(vcm, struct vcm_mmu,
+					   vcm)->driver->orders);
+}
+
+static int __must_check
+__vcm_mmu_activate(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	if (mmu->driver->activate)
+		return mmu->driver->activate(res, phys);
+
+	return vcm_phys_walk(res->start, phys, mmu->driver->orders,
+			     mmu->driver->activate_page,
+			     mmu->driver->deactivate_page, res->vcm);
+}
+
+static void __vcm_mmu_deactivate(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	if (mmu->driver->deactivate)
+		return mmu->driver->deactivate(res, phys);
+
+	vcm_phys_walk(res->start, phys, mmu->driver->orders,
+		      mmu->driver->deactivate_page, NULL, res->vcm);
+}
+
+static int vcm_mmu_bind(struct vcm_res *_res, struct vcm_phys *phys)
+{
+	struct vcm_mmu_res *res = container_of(_res, struct vcm_mmu_res, res);
+	struct vcm_mmu *mmu = container_of(_res->vcm, struct vcm_mmu, vcm);
+	int ret;
+
+	mutex_lock(&mmu->mutex);
+
+	if (mmu->activated) {
+		ret = __vcm_mmu_activate(_res, phys);
+		if (ret)
+			goto done;
+	}
+
+	list_add_tail(&res->bound, &mmu->bound_res);
+	ret = 0;
+
+done:
+	mutex_unlock(&mmu->mutex);
+
+	return ret;
+}
+
+static void vcm_mmu_unbind(struct vcm_res *_res)
+{
+	struct vcm_mmu_res *res = container_of(_res, struct vcm_mmu_res, res);
+	struct vcm_mmu *mmu = container_of(_res->vcm, struct vcm_mmu, vcm);
+
+	mutex_lock(&mmu->mutex);
+
+	if (mmu->activated)
+		__vcm_mmu_deactivate(_res, _res->phys);
+
+	list_del_init(&res->bound);
+
+	mutex_unlock(&mmu->mutex);
+}
+
+static void vcm_mmu_unreserve(struct vcm_res *res)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	mutex_lock(&mmu->mutex);
+	gen_pool_free(mmu->pool, res->start, res->res_size);
+	mutex_unlock(&mmu->mutex);
+}
+
+static int vcm_mmu_activate(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	struct vcm_mmu_res *r, *rr;
+	int ret;
+
+	mutex_lock(&mmu->mutex);
+
+	list_for_each_entry(r, &mmu->bound_res, bound) {
+		ret = __vcm_mmu_activate(&r->res, r->res.phys);
+		if (ret >= 0)
+			continue;
+
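+		/*
+		 * Activation failed: roll back the reservations that
+		 * were activated before this one and bail out.
+		 */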
+		list_for_each_entry(rr, &mmu->bound_res, bound) {
+			if (r == rr)
+				goto done;
+			__vcm_mmu_deactivate(&rr->res, rr->res.phys);
+		}
+	}
+
+	mmu->activated = 1;
+	ret = 0;
+
+done:
+	mutex_unlock(&mmu->mutex);
+
+	return ret;
+}
+
+static void vcm_mmu_deactivate(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	struct vcm_mmu_res *r;
+
+	mutex_lock(&mmu->mutex);
+
+	mmu->activated = 0;
+
+	list_for_each_entry(r, &mmu->bound_res, bound)
+		__vcm_mmu_deactivate(&r->res, r->res.phys);
+
+	mutex_unlock(&mmu->mutex);
+}
+
+struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu)
+{
+	static const struct vcm_driver driver = {
+		.cleanup	= vcm_mmu_cleanup,
+		.res		= vcm_mmu_res,
+		.phys		= vcm_mmu_phys,
+		.bind		= vcm_mmu_bind,
+		.unbind		= vcm_mmu_unbind,
+		.unreserve	= vcm_mmu_unreserve,
+		.activate	= vcm_mmu_activate,
+		.deactivate	= vcm_mmu_deactivate,
+	};
+
+	struct vcm *vcm;
+	int ret;
+
+	if (WARN_ON(!mmu || !mmu->driver ||
+		    !(mmu->driver->activate ||
+		      (mmu->driver->activate_page &&
+		       mmu->driver->deactivate_page)) ||
+		    !(mmu->driver->deactivate ||
+		      mmu->driver->deactivate_page)))
+		return ERR_PTR(-EINVAL);
+
+	mmu->vcm.driver = &driver;
+	vcm = vcm_init(&mmu->vcm);
+	if (IS_ERR(vcm))
+		return vcm;
+
+	mmu->pool = gen_pool_create(PAGE_SHIFT, -1);
+	if (!mmu->pool)
+		return ERR_PTR(-ENOMEM);
+
+	ret = gen_pool_add(mmu->pool, mmu->vcm.start, mmu->vcm.size, -1);
+	if (ret) {
+		gen_pool_destroy(mmu->pool);
+		return ERR_PTR(ret);
+	}
+
+	vcm->driver     = &driver;
+	INIT_LIST_HEAD(&mmu->bound_res);
+	mutex_init(&mmu->mutex);
+
+	return &mmu->vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_mmu_init);
+
+#endif
+
+/**************************** One-to-One wrapper ****************************/
+
+#ifdef CONFIG_VCM_O2O
+
+struct vcm_o2o_binding {
+	void			*priv;
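+	/*
+	 * Bit 0 of @dead tracks the binding's lifetime: it is set while
+	 * only one of the reservation/physical allocation pair is live,
+	 * and whichever of vcm_o2o_free() or vcm_o2o_unreserve() finds
+	 * it already set frees the whole structure.
+	 */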
+	unsigned long		dead[1];
+	struct vcm_res		res;
+	struct vcm_phys		phys;
+	/* vcm_phys is variable length, don't put anything at the end */
+};
+
+static void vcm_o2o_cleanup(struct vcm *vcm)
+{
+	struct vcm_o2o *o2o = container_of(vcm, struct vcm_o2o, vcm);
+	if (o2o->driver->cleanup)
+		o2o->driver->cleanup(vcm);
+	else
+		kfree(o2o);
+}
+
+static void vcm_o2o_free(struct vcm_phys *phys)
+{
+	struct vcm_o2o_binding *b =
+		container_of(phys, struct vcm_o2o_binding, phys);
+	struct vcm_o2o *o2o =
+		container_of(b->res.vcm, struct vcm_o2o, vcm);
+	o2o->driver->free(phys->parts, b->priv);
+	if (test_and_set_bit(0, b->dead))
+		kfree(b);
+}
+
+static void vcm_o2o_unreserve(struct vcm_res *res)
+{
+	struct vcm_o2o_binding *b =
+		container_of(res, struct vcm_o2o_binding, res);
+	if (test_and_set_bit(0, b->dead))
+		kfree(b);
+}
+
+static struct vcm_phys *
+vcm_o2o_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_o2o *o2o = container_of(vcm, struct vcm_o2o, vcm);
+	struct vcm_o2o_binding *b;
+	void *priv;
+
+	b = kmalloc(sizeof *b + sizeof *b->phys.parts, GFP_KERNEL);
+	if (!b)
+		return ERR_PTR(-ENOMEM);
+
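+	/*
+	 * A one-to-one context always allocates a single physically
+	 * contiguous part; the driver fills in its start address.
+	 */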
+	b->phys.parts->start = 0;
+	b->phys.parts->size  = size;
+	priv = o2o->driver->alloc(vcm, b->phys.parts, flags);
+	if (IS_ERR(priv)) {
+		kfree(b);
+		return ERR_CAST(priv);
+	}
+
+	if (WARN_ON(!b->phys.parts->size ||
+		    (b->phys.parts->start | b->phys.parts->size)
+		  & ~PAGE_MASK)) {
+		o2o->driver->free(b->phys.parts, priv);
+		kfree(b);
+		return ERR_PTR(-EINVAL);
+	}
+
+	b->priv		= priv;
+	b->dead[0]	= ~0;
+	b->res.start	= b->phys.parts->start;
+	b->res.res_size	= b->phys.parts->size;
+	b->phys.size	= b->phys.parts->size;
+	b->phys.count	= 1;
+	b->phys.free	= vcm_o2o_free;
+
+	return &b->phys;
+}
+
+static struct vcm_res *
+vcm_o2o_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags)
+{
+	struct vcm_o2o_binding *b =
+		container_of(phys, struct vcm_o2o_binding, phys);
+
+	if (!test_and_clear_bit(0, b->dead))
+		return ERR_PTR(-EBUSY);
+
+	return &b->res;
+}
+
+static int vcm_o2o_bind(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_o2o_binding *b =
+		container_of(res, struct vcm_o2o_binding, res);
+
+	if (&b->phys != phys)
+		return -EOPNOTSUPP;
+
+	if (WARN_ON(test_bit(0, b->dead)))
+		return -EINVAL;
+
+	return 0;
+}
+
+struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o)
+{
+	static const struct vcm_driver driver = {
+		.cleanup	= vcm_o2o_cleanup,
+		.phys		= vcm_o2o_phys,
+		.map		= vcm_o2o_map,
+		.bind		= vcm_o2o_bind,
+		.unreserve	= vcm_o2o_unreserve,
+	};
+
+	if (WARN_ON(!o2o || !o2o->driver ||
+		    !o2o->driver->alloc || !o2o->driver->free))
+		return ERR_PTR(-EINVAL);
+
+	o2o->vcm.driver = &driver;
+	return vcm_init(&o2o->vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_o2o_init);
+
+#endif
+
+/************************ Physical memory management ************************/
+
+#ifdef CONFIG_VCM_PHYS
+
+struct vcm_phys_list {
+	struct vcm_phys_list	*next;
+	unsigned		count;
+	struct vcm_phys_part	parts[31];
+};
+
+static struct vcm_phys_list *__must_check
+vcm_phys_alloc_list_order(struct vcm_phys_list *last, resource_size_t *pages,
+			  unsigned flags, unsigned order, unsigned *total)
+{
+	unsigned count;
+
+	count	= *pages >> order;
+
+	/* So, we need 'count' pages of the given order. */
+	do {
+		struct page *p = alloc_pages(GFP_DMA, order);
+
+		if (!p)
+			/*
+			 * If allocation failed we may still
+			 * try to continue allocating smaller
+			 * pages.
+			 */
+			break;
+
+		if (last->count == ARRAY_SIZE(last->parts)) {
+			struct vcm_phys_list *l;
+			l = kmalloc(sizeof *l, GFP_KERNEL);
+			if (!l)
+				return NULL;
+
+			l->next = NULL;
+			l->count = 0;
+			last->next = l;
+			last = l;
+		}
+
+		last->parts[last->count].start =
+			page_to_pfn(p) << PAGE_SHIFT;
+		last->parts[last->count].size =
+			(resource_size_t)1 << (order + PAGE_SHIFT);
+		++last->count;
+		++*total;
+		*pages -= 1 << order;
+	} while (--count);
+
+	return last;
+}
+
+static unsigned __must_check
+vcm_phys_alloc_list(struct vcm_phys_list *first,
+		    resource_size_t size, unsigned flags,
+		    const unsigned char *orders)
+{
+	struct vcm_phys_list *last = first;
+	unsigned total_parts = 0;
+	resource_size_t pages;
+
+	/*
+	 * We are trying to allocate as large pages as possible but
+	 * not larger than the pages that the MMU driver that called
+	 * us supports (i.e. the ones provided by the orders array).
+	 * This makes it possible to map the region using the fewest
+	 * possible number of entries.
+	 */
+	pages = size >> PAGE_SHIFT;
+	do {
+		while (!(pages >> *orders))
+			++orders;
+
+		last = vcm_phys_alloc_list_order(last, &pages, flags, *orders,
+						 &total_parts);
+		if (!last)
+			return 0;
+
+	} while (*orders++ && pages);
+
+	if (pages)
+		return 0;
+
+	return total_parts;
+}
+
+static void vcm_phys_free_parts(struct vcm_phys_part *parts, unsigned count)
+{
+	do {
+		__free_pages(pfn_to_page(parts->start >> PAGE_SHIFT),
+			     ffs(parts->size) - 1 - PAGE_SHIFT);
+		++parts;
+	} while (--count);
+}
+
+static void vcm_phys_alloc_cleanup(struct vcm_phys_list *lst)
+{
+	struct vcm_phys_list *first = lst;
+	do {
+		struct vcm_phys_list *l;
+
+		vcm_phys_free_parts(lst->parts, lst->count);
+
+		l = lst->next;
+		if (lst != first)
+			kfree(lst);
+		lst = l;
+	} while (lst);
+}
+
+static void vcm_phys_free(struct vcm_phys *phys)
+{
+	vcm_phys_free_parts(phys->parts, phys->count);
+}
+
+struct vcm_phys *__must_check
+vcm_phys_alloc(resource_size_t size, unsigned flags,
+	       const unsigned char *orders)
+{
+	struct vcm_phys_list first = { NULL, 0 }, *lst;
+	struct vcm_phys_part *out;
+	struct vcm_phys *phys;
+	unsigned count;
+
+	if (WARN_ON((size & (PAGE_SIZE - 1)) || !size || !orders))
+		return ERR_PTR(-EINVAL);
+
+	count = vcm_phys_alloc_list(&first, size, flags, orders);
+	if (!count)
+		goto error;
+
+	phys = kmalloc(sizeof *phys + count * sizeof *phys->parts, GFP_KERNEL);
+	if (!phys)
+		goto error;
+
+	phys->free = vcm_phys_free;
+	phys->count = count;
+	phys->size = size;
+
+	out = phys->parts;
+	lst = &first;
+	do {
+		struct vcm_phys_list *l;
+
+		memcpy(out, lst->parts, lst->count * sizeof *out);
+		out += lst->count;
+
+		l = lst->next;
+		if (lst != &first)
+			kfree(lst);
+		lst = l;
+	} while (lst);
+
+	return phys;
+
+error:
+	vcm_phys_alloc_cleanup(&first);
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(vcm_phys_alloc);
+
+static inline bool is_of_order(dma_addr_t size, unsigned order)
+{
+	return !(size & (((dma_addr_t)PAGE_SIZE << order) - 1));
+}
+
+static int
+__vcm_phys_walk_part(dma_addr_t vaddr, const struct vcm_phys_part *part,
+		     const unsigned char *orders,
+		     int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				     unsigned order, void *priv), void *priv,
+		     unsigned *limit)
+{
+	resource_size_t size = part->size;
+	dma_addr_t paddr = part->start;
+	resource_size_t ps;
+
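+	/*
+	 * Skip page sizes that the virtual or physical address is not
+	 * aligned to; a mapping of a given order requires both to be
+	 * aligned to that order.
+	 */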
+	while (!is_of_order(vaddr, *orders))
+		++orders;
+	while (!is_of_order(paddr, *orders))
+		++orders;
+
+	ps = PAGE_SIZE << *orders;
+	for (; *limit && size; --*limit) {
+		int ret;
+
+		while (ps > size)
+			ps = PAGE_SIZE << *++orders;
+
+		ret = callback(vaddr, paddr, *orders, priv);
+		if (ret < 0)
+			return ret;
+
+		ps = PAGE_SIZE << *orders;
+		vaddr += ps;
+		paddr += ps;
+		size  -= ps;
+	}
+
+	return 0;
+}
+
+int vcm_phys_walk(dma_addr_t _vaddr, const struct vcm_phys *phys,
+		  const unsigned char *orders,
+		  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *arg),
+		  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *arg),
+		  void *priv)
+{
+	unsigned limit = ~0;
+	int r = 0;
+
+	if (WARN_ON(!phys || ((_vaddr | phys->size) & (PAGE_SIZE - 1)) ||
+		    !phys->size || !orders || !callback))
+		return -EINVAL;
+
+	for (;;) {
+		const struct vcm_phys_part *part = phys->parts;
+		unsigned count = phys->count;
+		dma_addr_t vaddr = _vaddr;
+		int ret = 0;
+
+		for (; count && limit; --count, ++part) {
+			ret = __vcm_phys_walk_part(vaddr, part, orders,
+						   callback, priv, &limit);
+			if (ret)
+				break;
+
+			vaddr += part->size;
+		}
+
+		if (r)
+			/* We passed error recovery */
+			return r;
+
+		/*
+		 * Either operation succeeded or we were not provided
+		 * with a recovery callback -- return.
+		 */
+		if (!ret || !recovery)
+			return ret;
+
+		/* Switch to recovery */
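+		/*
+		 * During the first pass limit counted down from ~0, so
+		 * ~0 - limit is the number of pages processed
+		 * successfully; the recovery pass stops after exactly
+		 * that many pages.
+		 */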
+		limit = ~0 - limit;
+		callback = recovery;
+		r = ret;
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_phys_walk);
+
+#endif
-- 
1.7.1


* [RFCv5 7/9] mm: vcm: Virtual Contiguous Memory framework added
@ 2010-09-06  6:33   ` Michal Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel

This commit adds the Virtual Contiguous Memory framework which
provides an abstraction for virtual address space provided by
various MMUs present on the platform.

The framework uses pluggable MMU drivers for hardware MMUs and
if drivers obey some limitations it can also be used on
platforms with no MMU.

For more information see
<Documentation/virtual-contiguous-memory.txt>.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/00-INDEX                      |    2 +
 Documentation/virtual-contiguous-memory.txt |  853 ++++++++++++++++++++++++
 include/linux/vcm-drv.h                     |  299 +++++++++
 include/linux/vcm.h                         |  275 ++++++++
 mm/Kconfig                                  |   30 +
 mm/Makefile                                 |    1 +
 mm/vcm.c                                    |  932 +++++++++++++++++++++++++++
 7 files changed, 2392 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/virtual-contiguous-memory.txt
 create mode 100644 include/linux/vcm-drv.h
 create mode 100644 include/linux/vcm.h
 create mode 100644 mm/vcm.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index f93e787..7c79ffa 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -344,6 +344,8 @@ video-output.txt
 	- sysfs class driver interface to enable/disable a video output device.
 video4linux/
 	- directory with info regarding video/TV/radio cards and linux.
+virtual-contiguous-memory.txt
+	- documentation on virtual contiguous memory manager framework.
 vm/
 	- directory with info on the Linux vm code.
 volatile-considered-harmful.txt
diff --git a/Documentation/virtual-contiguous-memory.txt b/Documentation/virtual-contiguous-memory.txt
new file mode 100644
index 0000000..0c0e90c
--- /dev/null
+++ b/Documentation/virtual-contiguous-memory.txt
@@ -0,0 +1,853 @@
+                                                             -*- org -*-
+
+This document covers how to use the Virtual Contiguous Memory Manager
+(VCMM), how the implementation works, and how to implement MMU drivers
+that can be plugged into VCMM.  It also contains a rationale for VCMM.
+
+* The Virtual Contiguous Memory Manager
+
+The VCMM was built to solve the system-wide memory mapping issues that
+occur when many bus-masters have IOMMUs.
+
+An IOMMU maps device addresses to physical addresses.  It also
+insulates the system from spurious or malicious device bus
+transactions and allows fine-grained mapping attribute control.  The
+Linux kernel core does not contain a generic API to handle IOMMU
+mapped memory; device driver writers must implement device specific
+code to interoperate with the Linux kernel core.  As the number of
+IOMMUs increases, coordinating the many address spaces mapped by all
+discrete IOMMUs becomes difficult without in-kernel support.
+
+The VCMM API enables device independent IOMMU control, virtual memory
+manager (VMM) interoperation and non-IOMMU enabled device
+interoperation by treating devices with or without IOMMUs and all CPUs
+with or without MMUs, their mapping contexts and their mappings using
+common abstractions.  Physical hardware is given a generic device type
+and mapping contexts are abstracted into Virtual Contiguous Memory
+(VCM) regions.  Users "reserve" memory from VCMs and "bind" their
+reservations with physical memory.
+
+If drivers limit their use of VCM contexts to a subset of VCMM
+functionality, they can work without changes with or without an MMU.
+
+** Why the VCMM is Needed
+
+Driver writers who control devices with IOMMUs must contend with
+device control and memory management.  Driver writers have a large
+device driver API that they can leverage to control their devices, but
+they are lacking a unified API to help them program mappings into
+IOMMUs and share those mappings with other devices and CPUs in the
+system.
+
+Sharing is complicated by Linux's CPU-centric VMM.  The CPU-centric
+model generally makes sense because average hardware only contains
+a MMU for the CPU and possibly a graphics MMU.  If every device in the
+system has one or more MMUs the CPU-centric memory management (MM)
+programming model breaks down.
+
+Abstracting IOMMU device programming into a common API has already
+begun in the Linux kernel.  It was built to abstract the difference
+between AMD and Intel IOMMUs to support x86 virtualization on both
+platforms.  The interface is listed in include/linux/iommu.h.  It
+contains interfaces for mapping and unmapping as well as domain
+management.  This interface has not gained widespread use outside the
+x86; PA-RISC, Alpha and SPARC architectures and ARM and PowerPC
+platforms all use their own mapping modules to control their IOMMUs.
+The VCMM contains an IOMMU programming layer, but since its
+abstraction supports map management independent of device control, the
+layer is not used directly.  This higher-level view enables a new
+kernel service, not just an IOMMU interoperation layer.
+
+** The General Idea: Map Management using Graphs
+
+Looking at mapping from a system-wide perspective reveals a general
+graph problem.  The VCMM's API is built to manage the general mapping
+graph.  Each node that talks to memory, either through an MMU or
+directly (physically mapped) can be thought of as the device-end of
+a mapping edge.  The other edge is the physical memory (or
+intermediate virtual space) that is mapped.  The figure below shows
+an example tree with a CPU and a few devices connected to the memory
+directly or through an MMU.
+
++--------------------------------------------------------------------+
+|                               Memory                               |
++--------------------------------------------------------------------+
+                                  |
+   +------------------+-----------+-------+----------+-----------+
+   |                  |                   |          |           |
++-----+            +-----+             +-----+  +--------+  +--------+
+| MMU |            | MMU |             | MMU |  | Device |  | Device |
++-----+            +-----+             +-----+  +--------+  +--------+
+   |                  |                   |
++-----+       +-------+---+-----....   +-----+
+| CPU |       |           |            | GPU |
++-----+  +--------+  +--------+        +-----+
+         | Device |  | Device |  ...
+         +--------+  +--------+
+
+For each MMU in the system a VCM context is created, through
+which drivers can make reservations and bind virtual addresses to
+physical space.  In the direct-mapped case the device is assigned
+a one-to-one MMU (as shown on the figure below). This scheme allows
+direct mapped devices to participate in general graph management.
+
++--------------------------------------------------------------------+
+|                               Memory                               |
++--------------------------------------------------------------------+
+                                  |
+   +------------------+-----------+-------+----------------+
+   |                  |                   |                |
++-----+            +-----+             +-----+      +------------+
+| MMU |            | MMU |             | MMU |      | One-to-One |
++-----+            +-----+             +-----+      +------------+
+   |                  |                   |                |
++-----+       +-------+---+-----....   +-----+       +-----+-----+
+| CPU |       |           |            | GPU |       |           |
++-----+  +--------+  +--------+        +-----+  +--------+  +--------+
+         | Device |  | Device |  ...            | Device |  | Device |
+         +--------+  +--------+                 +--------+  +--------+
+
+The CPU nodes can also be brought under the same mapping abstraction
+with the use of a light overlay on the existing VMM. This light
+overlay allows VCMM-managed mappings to interoperate with the common
+API.  The light overlay enables this without substantial modifications
+to the existing VMM.
+
+In addition to CPU nodes that are running Linux (and the VMM), remote
+CPU nodes that may be running other operating systems can be brought
+into the general abstraction.  Routing all memory management requests
+from a remote node through the central memory management framework
+enables new features like system-wide memory migration.  This feature
+may only be feasible for large buffers that are managed outside of the
+fast-path, but having remote allocation in a system enables features
+that are impossible to build without it.
+
+The fundamental objects that support graph-based map management are:
+Virtual Contiguous Memory contexts, reservations, and physical memory
+allocations.
+
+* Usage Overview
+
+In a nutshell, the platform initialises a VCM context for each MMU in
+the system and possibly one-to-one VCM contexts which are passed to
+device drivers.  Later on, drivers reserve virtual address space
+from the VCM context.  At this point no physical memory has been
+committed to the reservation.  To bind physical memory with a
+reservation, physical memory is allocated (possibly discontiguous) and
+then bound to the reservation.
+
+A single physical allocation can be bound to several different
+reservations, also from different VCM contexts.  This allows devices
+connected to the memory banks through different MMUs (or directly) to
+share physical memory buffers; this also makes it possible to
+map such memory into the CPU's address space (be it kernel or user space)
+so that the same data can be accessed by the CPU.
+
+[[file:../include/linux/vcm.h][include/linux/vcm.h]] includes comments documenting each API.
+
+** Virtual Contiguous Memory context
+
+A Virtual Contiguous Memory context (VCM) abstracts an address space
+a device sees.  A VCM is created with a VCM driver dependent call.  It
+is destroyed with a call to:
+
+        void vcm_destroy(struct vcm *vcm);
+
+The newly created VCM instance can be passed to any function that needs to
+operate on or with a virtual contiguous memory region.  All internals
+of the VCM driver and how the mappings are handled are hidden and VCM
+driver dependent.
+
+** Bindings
+
+If all that a driver needs is to allocate some physical space and map it
+into its address space, a vcm_make_binding() call can be used:
+
+	struct vcm_res	*__must_check
+	vcm_make_binding(struct vcm *vcm, resource_size_t size,
+			 unsigned alloc_flags, unsigned res_flags);
+
+This call allocates physical memory, reserves virtual address space
+and binds those together.  If all of those succeed, a reservation is
+returned which has physical memory associated with it.
+
+If the driver does not require more complicated VCMM functionality, it is
+desirable to use this function since it will work on both real MMUs
+and one-to-one mappings.
+
+To destroy the created binding, vcm_destroy_binding() can be used:
+
+        void vcm_destroy_binding(struct vcm_res *res);
+
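+As an illustration, here is a minimal (untested) sketch of how a driver
+might use this pair of calls; "vcm" is assumed to be a context obtained
+from platform code:
+
+	struct vcm_res *res;
+
+	res = vcm_make_binding(vcm, 4 * PAGE_SIZE, 0, 0);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	/* res->start is the address visible from the device. */
+	/* ... use the buffer ... */
+
+	vcm_destroy_binding(res);
+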
+** Physical memory
+
+Physical memory allocations are handled using the following functions:
+
+	struct vcm_phys *__must_check
+	vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+	void vcm_free(struct vcm_phys *phys);
+
+It is noteworthy that physical space allocation is done in the context
+of a VCM.  This is especially important in case of one-to-one VCM
+contexts which cannot handle discontiguous physical memory.
+
+Also, depending on VCM context, the physical space may be allocated in
+parts of different sizes.  For instance, if a given MMU supports
+16MiB, 1MiB, 64KiB and 4KiB pages, it is likely that vcm_alloc() in
+the context of this MMU's driver will try to split the allocation into
+as few parts of those sizes as possible.
+
+In case of one-to-one VCM contexts, physical memory allocated with
+a call to vcm_alloc() may be usable only with the vcm_map() function.
+
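+For example, a short (untested) sketch of allocating physical memory
+in the context of some VCM "vcm" and freeing it once it is no longer
+bound anywhere:
+
+	struct vcm_phys *phys;
+
+	phys = vcm_alloc(vcm, SZ_1M, 0);
+	if (IS_ERR(phys))
+		return PTR_ERR(phys);
+
+	/* ... bind phys to one or more reservations, use it, unbind ... */
+
+	vcm_free(phys);
+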
+** Mappings
+
+The easiest way to map a physical space into virtual address space
+represented by VCM context is to use the vcm_map() function:
+
+	struct vcm_res *__must_check
+	vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags);
+
+This function reserves address space from the VCM context and binds
+physical space to it.  To reverse the process vcm_unmap() can be used:
+
+	void vcm_unmap(struct vcm_res *res);
+
+Similarly to vcm_make_binding(), use of vcm_map() may be advantageous
+over the use of vcm_reserve() followed by vcm_bind().  This is not
+only true for one-to-one mappings: if it so happens that the call to
+vcm_map() requests mapping of a physically contiguous space into kernel
+space, a direct mapping can be returned instead of creating a new one.
+
+In some cases, a reservation created with vcm_map() can be used only
+with the physical memory passed as the argument to vcm_map() (so if
+user chooses to call vcm_unbind() and then vcm_bind() on a different
+physical memory, the call may fail).
+
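+A hedged sketch (untested) combining the two previous sections, again
+assuming a VCM context "vcm":
+
+	struct vcm_phys *phys;
+	struct vcm_res *res;
+
+	phys = vcm_alloc(vcm, SZ_1M, 0);
+	if (IS_ERR(phys))
+		return PTR_ERR(phys);
+
+	res = vcm_map(vcm, phys, 0);
+	if (IS_ERR(res)) {
+		vcm_free(phys);
+		return PTR_ERR(res);
+	}
+
+	/* ... use res->start ... */
+
+	vcm_unmap(res);
+	vcm_free(phys);
+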
+** Reservations
+
+A reservation is a contiguous region allocated from a virtual address
+space represented by a VCM context.  Just after a reservation is
+created, no physical memory is bound to it.  To manage reservations
+the following two functions are provided:
+
+	struct vcm_res *__must_check
+	vcm_reserve(struct vcm *vcm, resource_size_t size,
+		    unsigned flags);
+
+	void vcm_unreserve(struct vcm_res *res);
+
+The first one creates a reservation of desired size, and the second
+one destroys it.
+
+** Binding memory
+
+To bind physical memory to a reservation, the vcm_bind() function is
+used:
+
+	int __must_check vcm_bind(struct vcm_res *res,
+				  struct vcm_phys *phys);
+
+When the binding is no longer needed, vcm_unbind() destroys the
+connection:
+
+	struct vcm_phys *vcm_unbind(struct vcm_res *res);
+
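+Putting reservations and bindings together, the following untested
+sketch is the long-hand equivalent of vcm_map()/vcm_unmap(); "phys" is
+assumed to have been allocated with vcm_alloc() as described earlier:
+
+	struct vcm_res *res;
+	int err;
+
+	res = vcm_reserve(vcm, phys->size, 0);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	err = vcm_bind(res, phys);
+	if (err) {
+		vcm_unreserve(res);
+		return err;
+	}
+
+	/* ... use res->start ... */
+
+	vcm_unbind(res);
+	vcm_unreserve(res);
+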
+** Activating mappings
+
+Unless a VCM context is activated, none of the bindings are actually
+guaranteed to be available.  When a device driver needs the mappings,
+it needs to call the vcm_activate() function to guarantee that the
+mappings are sent to the hardware MMU.
+
+	int  __must_check vcm_activate(struct vcm *vcm);
+
+After VCM context is activated all further bindings (made with
+vcm_make_binding(), vcm_map() or vcm_bind()) will be updated so there
+is no need to call vcm_activate() after each binding is done or
+undone.
+
+To deactivate the VCM context vcm_deactivate() function is used:
+
+	void vcm_deactivate(struct vcm *vcm);
+
+Both of those functions can be called several times if all calls to
+vcm_activate() are paired with a later call to vcm_deactivate().
+
+** Device driver example
+
+The following is a simple, untested example of how the platform and
+devices work together to use the VCM framework.  The platform
+initialises contexts for each MMU in the system and, through platform
+device data, passes them to the correct drivers.
+
+Device driver header file:
+
+	struct foo_platform_data {
+		/* ... */
+		struct vcm	*vcm;
+		/* ... */
+	};
+
+Platform code:
+
+	static int plat_bar_vcm_init(void)
+	{
+		struct foo_platform_data *fpdata;
+		struct vcm *vcm;
+
+		vcm = vcm_baz_create(...);
+		if (IS_ERR(vcm))
+			return PTR_ERR(vcm);
+
+		fpdata = dev_get_platdata(&foo_device.dev);
+		fpdata->vcm = vcm;
+
+		/* ... */
+
+		return 0;
+	}
+
+Device driver implementation:
+
+	struct foo_private {
+		/* ... */
+		struct vcm_res	*fw;
+		/* ... */
+	};
+
+	static inline struct vcm_res *__must_check
+	__foo_alloc(struct device *dev, size_t size)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		return vcm_make_binding(pdata->vcm, size, 0, 0);
+	}
+
+	static inline void __foo_free(struct vcm_res *res)
+	{
+		vcm_destroy_binding(res);
+	}
+
+	static int foo_probe(struct device *dev)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct foo_private *priv;
+
+		if (IS_ERR_OR_NULL(pdata->vcm))
+			return pdata->vcm ? PTR_ERR(pdata->vcm) : -EINVAL;
+
+		priv = kzalloc(sizeof *priv, GFP_KERNEL);
+		if (!priv)
+			return -ENOMEM;
+
+		/* ... */
+
+		priv->fw = __foo_alloc(dev, 1 << 20);
+		if (IS_ERR(priv->fw)) {
+			kfree(priv);
+			return PTR_ERR(priv->fw);
+		}
+		/* copy firmware to fw */
+
+		vcm_activate(pdata->vcm);
+
+		dev->p = priv;
+
+		return 0;
+	}
+
+	static int foo_remove(struct device *dev)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct foo_private *priv = dev->p;
+
+		/* ... */
+
+		vcm_deactivate(pdata->vcm);
+		__foo_free(priv->fw);
+
+		kfree(priv);
+
+		return 0;
+	}
+
+	static int foo_do_something(struct device *dev, /* ... */)
+	{
+		struct foo_platform_data *pdata =
+			dev_get_platdata(dev);
+		struct vcm_res *buf;
+		int ret;
+
+		buf = __foo_alloc(dev, /* ... size ... */);
+		if (IS_ERR(buf))
+			return PTR_ERR(buf);
+
+		/*
+		 * buf->start is address visible from device's
+		 * perspective.
+		 */
+
+		/* ... set hardware up ... */
+
+		/* ... wait for completion ... */
+
+		__foo_free(buf);
+
+		return ret;
+	}
+
+In the above example only the vcm_make_binding() function is used, so
+the above scheme will work not only on systems with an MMU but also
+with a one-to-one VCM context.
+
+** IOMMU, one-to-one and VMM contexts
+
+The following example demonstrates mapping IOMMU, one-to-one and VMM
+reservations to the same physical memory.  For readability, error
+handling is not shown on the listings.
+
+First, each context needs to be created.  The call used for creating
+a context depends on the driver used.  The following is just an
+example of how this could look:
+
+	struct vcm *vcm_vmm, *vcm_onetoone, *vcm_iommu;
+
+	vcm_vmm      = vcm_vmm_create();
+	vcm_onetoone = vcm_onetoone_create();
+	vcm_iommu    = vcm_foo_mmu_create();
+
+Once contexts are created, physical space needs to be allocated,
+reservations made on each context and physical memory mapped to those
+reservations.  Because there is a one-to-one context, the memory has
+to be allocated from its context.  It's also best to map the memory in
+a single call using vcm_make_binding():
+
+	struct vcm_res *res_onetoone;
+
+	res_onetoone = vcm_make_binding(vcm_onetoone, SZ_2M + SZ_4K, 0, 0);
+
+What's left is to map the space in the other two contexts.  If the
+reservations in the other two contexts won't be used for any other
+purpose than to reference the memory allocated above, it's best to
+use vcm_map():
+
+	struct vcm_res *res_vmm, *res_iommu;
+
+	res_vmm = vcm_map(vcm_vmm, res_onetoone->phys, 0);
+	res_iommu = vcm_map(vcm_iommu, res_onetoone->phys, 0);
+
+Once the bindings have been created, the contexts need to be activated
+to make sure that they are actually on the hardware. (In case of
+one-to-one mapping it's most likely a no-operation but it's still
+required by the VCMM API so it must not be omitted.)
+
+	vcm_activate(vcm_vmm);
+	vcm_activate(vcm_onetoone);
+	vcm_activate(vcm_iommu);
+
+At this point, all three reservations represent addresses in their
+respective address spaces, bound to the same physical memory.  Not
+only can the CPU access it now, but also devices connected through
+the MMU, as well as devices connected directly to the memory banks.
+The bus address for the devices and the virtual address for the CPU
+are available through the 'start' member of the vcm_res structure
+(ie. the res_* objects
+above).
+
+Once the mapping is no longer used and memory no longer needed it can
+be freed as follows:
+
+	vcm_unmap(res_vmm);
+	vcm_unmap(res_iommu);
+	vcm_destroy_binding(res_onetoone);
+
+If the contexts are not needed either, they can be disabled:
+
+	vcm_deactivate(vcm_vmm);
+	vcm_deactivate(vcm_iommu);
+	vcm_deactivate(vcm_onetoone);
+
+and then even destroyed:
+
+	vcm_destroy(vcm_vmm);
+	vcm_destroy(vcm_iommu);
+	vcm_destroy(vcm_onetoone);
+
+* Available drivers
+
+The following VCM drivers are provided:
+
+** Real hardware drivers
+
+There are no real hardware drivers at this time.
+
+** One-to-One drivers
+
+As has been noted, One-to-One drivers are limited in the sense that
+certain operations are very unlikely to succeed.  In fact, it is often
+certain that some operations will fail.  If your driver needs to be
+able to run with a One-to-One driver, you should limit operations to:
+
+	vcm_make_binding()
+	vcm_destroy_binding()
+
+Under some conditions, vcm_map() may also work.
+
+There are no One-to-One drivers at this time.
+
+* Writing a VCM driver
+
+The core of VCMM does not handle communication with the MMU.  For this
+purpose a VCM driver is used.  Its purpose is to manage virtual
+address space reservations and physical allocations, as well as to
+update mappings in the hardware MMU.
+
+The API designed for VCM drivers is described in the
+[[file:../include/linux/vcm-drv.h][include/linux/vcm-drv.h]] file, so it might be a good idea to take
+a look inside.
+
+VCMM provides an API for three different kinds of drivers.  The most
+basic is a core VCM driver which VCMM uses directly.  Other than that,
+VCMM provides two wrappers -- VCM MMU and VCM One-to-One -- which can
+be used to create drivers for real hardware VCM contexts and for
+One-to-One contexts.
+
+All of the drivers need to provide a context creation function which
+will allocate memory, fill in the start address, size and pointer to
+driver operations, and then call an init function which fills in the
+rest of the fields and validates the entered values.
+
+** Writing a core VCM driver
+
+The core driver needs to provide a context creation function as well
+as at least some of the following operations:
+
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*alloc)(struct vcm *vcm, resource_size_t size,
+		     struct vcm_phys **phys, unsigned alloc_flags,
+		     struct vcm_res **res, unsigned res_flags);
+	struct vcm_res *(*res)(struct vcm *vcm, resource_size_t size,
+			       unsigned flags);
+	struct vcm_phys *(*phys)(struct vcm *vcm, resource_size_t size,
+				 unsigned flags);
+
+	void (*unreserve)(struct vcm_res *res);
+
+	struct vcm_res *(*map)(struct vcm *vcm, struct vcm_phys *phys,
+			       unsigned flags);
+	int (*bind)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*unbind)(struct vcm_res *res);
+
+	int (*activate)(struct vcm *vcm);
+	void (*deactivate)(struct vcm *vcm);
+
+All of the operations (except for alloc) may assume that all pointer
+arguments are not NULL.  (In case of alloc, if any argument is NULL it
+is either phys or res, never both.)
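+
+For a hypothetical vcm_foo driver (all names below are only
+illustrative), the operations would typically be gathered in a static
+structure, for instance:
+
+	static const struct vcm_driver vcm_foo_driver = {
+		.cleanup	= vcm_foo_cleanup,
+		.res		= vcm_foo_res,
+		.phys		= vcm_foo_phys,
+		.unreserve	= vcm_foo_unreserve,
+		.bind		= vcm_foo_bind,
+		.unbind		= vcm_foo_unbind,
+		.activate	= vcm_foo_activate,
+		.deactivate	= vcm_foo_deactivate,
+	};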
+
+*** Context creation
+
+To use a VCM driver a VCM context bound to the driver has to be
+provided.  This is done by a driver-dependent call defined in its
+header file.  Such a call may take various arguments to configure the
+context of the MMU.  Its prototype may look as follows:
+
+	struct vcm *__must_check vcm_samp_create(/* ... */);
+
+The driver will most likely define a structure encapsulating the vcm
+structure (in the usual way).  The context creation function must
+allocate space for such a structure and initialise it correctly,
+including all members of the vcm structure except for activations.
+The activations member is initialised by calling:
+
+	struct vcm *__must_check vcm_init(struct vcm *vcm);
+
+This function also validates that all fields are set correctly.
+
+The driver field of the vcm structure must point to a structure with
+all operations supported by the driver.
+
+If everything succeeds, the function has to return a pointer to the
+vcm structure inside the encapsulating structure.  It is this pointer
+that will be passed to all of the driver's operations.  On error, an
+error pointer must be returned (ie. never NULL).
+
+The function might look something like the following:
+
+	struct vcm *__must_check vcm_foo_create(/* ... */)
+	{
+		struct vcm_foo *foo;
+		struct vcm *vcm;
+
+		foo = kzalloc(sizeof *foo, GFP_KERNEL);
+		if (!foo)
+			return ERR_PTR(-ENOMEM);
+
+		/* ... do stuff ... */
+
+		foo->vcm.start  = /* ... */;
+		foo->vcm.size   = /* ... */;
+		foo->vcm.driver = &vcm_foo_driver;
+
+		vcm = vcm_init(&foo->vcm);
+		if (IS_ERR(vcm)) {
+			/* ... error recovery ... */
+			kfree(foo);
+		}
+		return vcm;
+	}
+
+*** Cleaning up
+
+The cleanup operation is called when the VCM context is destroyed.
+Its purpose is to free all resources acquired when the VCM context was
+created, including the space for the context structure.  If it is not
+given, the memory is freed using the kfree() function.
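+
+For the hypothetical vcm_foo driver from the previous section,
+a minimal cleanup callback could look like this (a sketch only):
+
+	static void vcm_foo_cleanup(struct vcm *vcm)
+	{
+		struct vcm_foo *foo = container_of(vcm, struct vcm_foo, vcm);
+
+		/* ... release driver-private resources ... */
+
+		kfree(foo);
+	}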
+
+*** Allocation and reservations
+
+If alloc operation is specified, res and phys operations are ignored.
+The observable behaviour of the alloc operation should mimic as
+closely as possible res and phys operations called one after the
+other.
+
+The reason for this operation is that in case of one-to-one VCM
+contexts, the driver may not be able to bind an arbitrary reservation
+to an arbitrary physical space.  In one-to-one contexts, reservations
+and physical memory are tied together and need to be made at the same
+time to make binding possible.
+
+The alloc operation may be called with both res and phys set, or with
+at most one of them being NULL.
+
+The res operation reserves virtual address space in the VCM context.
+The function must set the start and res_size members of the vcm_res
+structure -- all other fields are filled by the VCMM framework.
+
+The phys operation allocates physical space which can later be bound
+to the reservation.  Unless the VCM driver needs some special handling
+of physical memory, the vcm_phys_alloc() function can be used:
+
+	struct vcm_phys *__must_check
+	vcm_phys_alloc(resource_size_t size, unsigned flags,
+		       const unsigned char *orders);
+
+The last argument of this function (orders) is an array of orders of
+page sizes that the function should try to allocate.  This array must
+be sorted from the highest order to the lowest and the last entry must
+be zero.
+
+For instance, an array { 8, 4, 0 } means that the function should try
+to allocate 1MiB, 64KiB and 4KiB pages (this assumes PAGE_SIZE is
+4KiB, which is true for all supported architectures).  For example, if
+the requested size is 2MiB and 68KiB, the function will try to
+allocate two 1MiB pages, one 64KiB page and one 4KiB page.  This may
+be useful when the mapping is written to the MMU since the largest
+possible pages will be used, reducing the number of entries.
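+
+As a sketch, a phys callback that needs no special handling could
+simply delegate to vcm_phys_alloc() (the orders array below is just an
+example):
+
+	static struct vcm_phys *
+	vcm_foo_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+	{
+		/* Try 1MiB pages first, then 64KiB, finally 4KiB. */
+		static const unsigned char orders[] = { 8, 4, 0 };
+
+		return vcm_phys_alloc(size, flags, orders);
+	}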
+
+If the phys or alloc callback chooses to allocate physical memory on
+its own, it must provide a free callback along with the vcm_phys
+structure.  The purpose of the callback is, as one may imagine, to
+free the allocated space.
+
+All those operations may assume that size is non-zero and divisible
+by PAGE_SIZE.
+
+*** Binding
+
+The map operation is optional and it joins the res and bind operations
+together.  Like the alloc operation, it is provided because in case of
+one-to-one mappings, the VCM driver may be unable to bind physical
+space to an arbitrary reservation.
+
+Moreover, in case of some VCM drivers, a mapping for the given
+physical memory may already be present (ie. in case of using VMM).
+
+A reservation created with the map operation does not have to be
+usable with any physical space other than the one provided when the
+reservation was created.
+
+The bind operation binds a given reservation to a given physical
+memory.  The operation may assume that the reservation given as an
+argument is not bound to any physical memory.
+
+Whichever of the two operations is used, the binding must be reflected
+on the hardware if the VCM context has been activated.  If the VCM
+context has not been activated this is not required.
+
+The vcm_map() function uses the map operation if one is provided.
+Otherwise, it falls back to the alloc or res operation followed by the
+bind operation.  If this is also not possible, -EOPNOTSUPP is
+returned.  Similarly, the vcm_bind() function uses the bind operation;
+if it is not provided, -EOPNOTSUPP is returned.
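+
+When no map operation is provided, vcm_map() thus behaves roughly like
+the following sketch (this mirrors the fallback path in mm/vcm.c):
+
+	res = vcm_reserve(vcm, phys->size, flags);
+	if (IS_ERR(res))
+		return res;
+	ret = vcm_bind(res, phys);
+	if (ret) {
+		vcm_unreserve(res);
+		return ERR_PTR(ret);
+	}
+	return res;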
+
+Also, if alloc operation is not provided but map is, the
+vcm_make_binding() function will use phys and map operations.
+
+*** Freeing resources
+
+The unbind callback removes the binding between a reservation and
+physical memory.  If the unbind operation is not provided, VCMM
+assumes that it is a no-operation.
+
+The unreserve callback releases a reservation as well as frees the
+space allocated for the vcm_res structure.  It is required; if it is
+not provided vcm_unreserve() will generate a warning.
+
+*** Activation
+
+When a VCM context is activated, the activate callback is called.  It
+is called only once even if vcm_activate() is called several times on
+the same context.
+
+When a VCM context is deactivated (that is, when vcm_deactivate() has
+been called once for each call to vcm_activate()) the deactivate
+callback is called.
+
+When a VCM context is activated, all bound reservations must be
+reflected on the hardware MMU (if any).  Also, after activation, all
+calls to vcm_bind(), vcm_map() or vcm_make_binding() must
+automatically reflect new mappings on the hardware MMU.
+
+Neither of the operations is required; if missing, VCMM will assume
+they are no-operations and no warning will be generated.
+
+** Writing a hardware MMU driver
+
+It may be undesirable to implement all of the operations that are
+required to create a usable driver.  In case of hardware MMUs a helper
+wrapper driver has been created to make writing real drivers as simple
+as possible.
+
+The wrapper implements most of the functionality of the driver,
+leaving only the actual communication with the hardware MMU in the
+hands of the programmer.  Reservation management as well as general
+housekeeping is already there.
+
+If you want to use this wrapper, you need to select VCM_MMU Kconfig
+option.
+
+*** Context creation
+
+Similarly to normal drivers, an MMU driver needs to provide a context
+creation function.  Such a function must allocate a vcm_mmu object and
+initialise the vcm.start, vcm.size and driver fields of the structure.
+When this is done, vcm_mmu_init() should be called, which will
+initialise the rest of the fields and validate the entered values:
+
+	struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu);
+
+This is, in fact, very similar to the way a standard driver is
+created.
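+
+A hypothetical context creation function for such a driver might look
+as follows (names are illustrative only):
+
+	struct vcm *__must_check vcm_foo_mmu_create(/* ... */)
+	{
+		struct vcm_mmu *mmu;
+		struct vcm *vcm;
+
+		mmu = kzalloc(sizeof *mmu, GFP_KERNEL);
+		if (!mmu)
+			return ERR_PTR(-ENOMEM);
+
+		mmu->vcm.start = /* ... start of the bus address space ... */;
+		mmu->vcm.size  = /* ... size of the bus address space ... */;
+		mmu->driver    = &vcm_foo_mmu_driver;
+
+		vcm = vcm_mmu_init(mmu);
+		if (IS_ERR(vcm))
+			kfree(mmu);
+		return vcm;
+	}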
+
+*** Orders
+
+One of the fields of the vcm_mmu_driver structure is orders.  This is
+an array of orders of pages supported by the hardware MMU.  It must be
+sorted from largest to smallest and zero terminated.
+
+An order is the base-two logarithm of the supported page size divided
+by PAGE_SIZE.  For instance, { 8, 4, 0 } means that the MMU supports
+1MiB, 64KiB and 4KiB pages.
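+
+For example, the driver structure for an MMU supporting 1MiB, 64KiB
+and 4KiB pages (assuming 4KiB PAGE_SIZE) could declare (a sketch, used
+by the hypothetical vcm_foo_mmu_create() above):
+
+	static const unsigned char vcm_foo_mmu_orders[] = { 8, 4, 0 };
+
+	static const struct vcm_mmu_driver vcm_foo_mmu_driver = {
+		.orders		= vcm_foo_mmu_orders,
+		/* ... operations ... */
+	};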
+
+*** Operations
+
+The operations that the MMU wrapper driver uses are:
+
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*activate)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*deactivate)(struct vcm_res *res, struct vcm_phys *phys);
+
+	int (*activate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			     unsigned order, void *vcm),
+	int (*deactivate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			       unsigned order, void *vcm),
+
+The first one frees all resources allocated by the context creation
+function (including the structure itself).  If this operation is not
+given, kfree() will be called on the vcm_mmu structure.
+
+The activate and deactivate operations are used to update mappings in
+the MMU.  Whenever a binding is activated or deactivated the
+respective operation is called.  Either these operations or the
+per-page variants described below must be provided.
+
+To divide mapping into physical pages, vcm_phys_walk() function can be
+used:
+
+	int vcm_phys_walk(dma_addr_t vaddr, const struct vcm_phys *phys,
+			  const unsigned char *orders,
+			  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+					  unsigned order, void *priv),
+			  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+					  unsigned order, void *priv),
+			  void *priv);
+
+It starts from the given virtual address and tries to divide the
+allocated physical memory into as few pages as possible, where the
+order of each page is one of the orders specified by the orders
+argument.
+
+It may be easier to implement the activate_page and deactivate_page
+operations instead, though.  They are called on each individual page
+rather than on the whole mapping.  The wrapper then calls
+vcm_phys_walk() itself so the driver does not need to call it
+explicitly.
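+
+A sketch of such callbacks for a hypothetical MMU could look like the
+following (foo_mmu_set_pte() and foo_mmu_clear_pte() are made-up
+helpers standing for the actual page table manipulation):
+
+	static int vcm_foo_mmu_activate_page(dma_addr_t vaddr, dma_addr_t paddr,
+					     unsigned order, void *vcm)
+	{
+		/* Map one page of (PAGE_SIZE << order) bytes. */
+		return foo_mmu_set_pte(vcm, vaddr, paddr, order);
+	}
+
+	static int vcm_foo_mmu_deactivate_page(dma_addr_t vaddr, dma_addr_t paddr,
+					       unsigned order, void *vcm)
+	{
+		return foo_mmu_clear_pte(vcm, vaddr, order);
+	}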
+
+** Writing a one-to-one VCM driver
+
+Similarly to the wrapper for a real hardware MMU, a wrapper for
+one-to-one VCM contexts has been created.  It implements all of the
+housekeeping operations and leaves only contiguous memory management
+(that is, allocating and freeing contiguous regions) to the driver.
+
+*** Context creation
+
+As with other drivers, a one-to-one driver needs to provide a context
+creation function.  It needs to allocate space for a vcm_o2o structure
+and initialise its vcm.start, vcm.size and driver fields.  Calling
+vcm_o2o_init() will fill the other fields and validate the entered
+values:
+
+	struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o);
+
+There are the following three operations used by the wrapper:
+
+	void (*cleanup)(struct vcm *vcm);
+	void *(*alloc)(struct vcm *vcm, struct vcm_phys_part *part,
+		       unsigned flags);
+	void (*free)(struct vcm_phys_part *part, void *priv);
+
+The cleanup operation cleans the context and frees all resources.  If
+not provided, kfree() is used.
+
+The alloc operation allocates physically contiguous memory.  The size
+of the requested block is given in the provided vcm_phys_part
+structure.  Inside this structure, the operation must save the
+physical address of the allocated block if the allocation succeeds.
+On error, the operation must return an error pointer (NULL is treated
+as success).
+
+The free operation frees the physically contiguous memory that has
+been allocated with alloc.  As arguments, it is given the same
+vcm_phys_part structure that alloc initialised as well as a priv
+argument which is the value returned by the alloc operation.
+
+Both the alloc and free operations are required.
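+
+As an illustration only (not part of this patchset), a one-to-one
+driver backed by plain page allocations could implement the two
+operations roughly as follows:
+
+	static void *vcm_foo_o2o_alloc(struct vcm *vcm,
+				       struct vcm_phys_part *part,
+				       unsigned flags)
+	{
+		void *virt = alloc_pages_exact(part->size, GFP_KERNEL);
+		if (!virt)
+			return ERR_PTR(-ENOMEM);
+		part->start = virt_to_phys(virt);
+		return virt;	/* passed back to the free operation as priv */
+	}
+
+	static void vcm_foo_o2o_free(struct vcm_phys_part *part, void *priv)
+	{
+		free_pages_exact(priv, part->size);
+	}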
+
+If you want to use this wrapper, you need to select VCM_O2O Kconfig
+option.
+
+* Epilogue
+
+The initial version of the VCMM framework was written by Zach Pfeffer
+<zpfeffer@codeaurora.org>.  It was then redesigned and mostly
+rewritten by Michal Nazarewicz <m.nazarewicz@samsung.com>.
+
+The new version is still lacking a few important features.  Most
+notably, no real hardware MMU driver has been implemented yet.  This
+may be ported from Zach's original proposal.
+
+Also, support for VMM is lacking.  This is another thing that can be
+ported from Zach's proposal.
diff --git a/include/linux/vcm-drv.h b/include/linux/vcm-drv.h
new file mode 100644
index 0000000..8c1862e
--- /dev/null
+++ b/include/linux/vcm-drv.h
@@ -0,0 +1,299 @@
+/*
+ * Virtual Contiguous Memory driver API header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_DRV_H
+#define __LINUX_VCM_DRV_H
+
+#include <linux/vcm.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+
+#include <asm/atomic.h>
+
+/**
+ * struct vcm_driver - the MMU driver operations.
+ * @cleanup:	called when the vcm object is destroyed; if omitted,
+ *		kfree() will be used.
+ * @alloc:	callback function for allocating physical memory and
+ *		reserving virtual address space; XXX FIXME: document;
+ *		if set, @res and @phys are ignored.
+ * @res:	creates a reservation of virtual address space; XXX FIXME:
+ *		document; if @alloc is provided this is ignored.
+ * @phys:	allocates physical memory; XXX FIXME: document; if @alloc
+ *		is provided this is ignored.
+ * @unreserve:	destroys a virtual address space reservation created by @alloc;
+ *		required.
+ * @map:	reserves address space and binds a physical memory to it.
+ * @bind:	binds a physical memory to a reserved address space.
+ * @unbind:	unbinds a physical memory from reserved address space.
+ * @activate:	activates the context making all bindings active; once
+ *		the context has been activated, this callback is not
+ *		called again until context is deactivated and
+ *		activated again (so if user calls vcm_activate()
+ *		several times only the first call in sequence will
+ *		invoke this callback).
+ * @deactivate:	deactivates the context making all bindings inactive;
+ *		a call to this callback always accompanies a call to the
+ *		@activate callback.
+ */
+struct vcm_driver {
+	void (*cleanup)(struct vcm *vcm);
+
+	int (*alloc)(struct vcm *vcm, resource_size_t size,
+		     struct vcm_phys **phys, unsigned alloc_flags,
+		     struct vcm_res **res, unsigned res_flags);
+	struct vcm_res *(*res)(struct vcm *vcm, resource_size_t size,
+			       unsigned flags);
+	struct vcm_phys *(*phys)(struct vcm *vcm, resource_size_t size,
+				 unsigned flags);
+
+	void (*unreserve)(struct vcm_res *res);
+
+	struct vcm_res *(*map)(struct vcm *vcm, struct vcm_phys *phys,
+			       unsigned flags);
+	int (*bind)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*unbind)(struct vcm_res *res);
+
+	int (*activate)(struct vcm *vcm);
+	void (*deactivate)(struct vcm *vcm);
+};
+
+/**
+ * struct vcm_phys - representation of allocated physical memory.
+ * @count:	number of contiguous parts the memory consists of; if this
+ *		equals one the whole memory block is physically contiguous;
+ *		read only.
+ * @size:	total size of the allocated memory; read only.
+ * @free:	callback function called when memory is freed; internal.
+ * @bindings:	how many virtual address space reservations this memory has
+ *		been bound to; internal.
+ * @parts:	array of @count parts describing each physically contiguous
+ *		memory block that the whole area consists of; each element
+ *		describes part's physical starting address in bytes
+ *		(@parts->start) and its size in bytes (@parts->size); read
+ *		only.
+ */
+struct vcm_phys {
+	unsigned		count;
+	resource_size_t		size;
+
+	void (*free)(struct vcm_phys *phys);
+	atomic_t		bindings;
+
+	struct vcm_phys_part {
+		dma_addr_t	start;
+		resource_size_t	size;
+	} parts[0];
+};
+
+/**
+ * vcm_init() - initialises VCM context structure.
+ * @vcm:	the VCM context to initialise.
+ *
+ * This function initialises the vcm structure created by an MMU driver
+ * when setting things up.  It sets up all fields of the vcm structure
+ * except for @vcm->start, @vcm->size and @vcm->driver which are
+ * validated by this function.  If they have invalid values the
+ * function produces a warning and returns an error pointer.  If
+ * everything is fine, @vcm is returned.
+ */
+struct vcm *__must_check vcm_init(struct vcm *vcm);
+
+#ifdef CONFIG_VCM_MMU
+
+struct vcm_mmu;
+
+/**
+ * struct vcm_mmu_driver - a driver used for real MMUs.
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ * @cleanup:	Function called when the VCM context is destroyed;
+ *		optional, if not provided, kfree() is used.
+ * @activate:	callback function for activating a single mapping; its
+ *		role is to set up the MMU so that the reserved address
+ *		space denoted by @res will point to the physical memory
+ *		denoted by @phys; required unless @activate_page and
+ *		@deactivate_page are both provided.
+ * @deactivate:	this reverses the effect of @activate; required unless
+ *		@deactivate_page is provided.
+ * @activate_page:	callback function for activating a single page; it is
+ *			ignored if @activate is provided; it's given a single
+ *			page such that its order (given as third argument) is
+ *			one of the supported orders specified in @orders;
+ *			required unless @activate is provided.
+ * @deactivate_page:	this reverses the effect of the @activate_page
+ *			callback; required unless @activate and @deactivate
+ *			are both provided.
+ */
+struct vcm_mmu_driver {
+	const unsigned char	*orders;
+
+	void (*cleanup)(struct vcm *vcm);
+	int (*activate)(struct vcm_res *res, struct vcm_phys *phys);
+	void (*deactivate)(struct vcm_res *res, struct vcm_phys *phys);
+	int (*activate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			     unsigned order, void *vcm);
+	int (*deactivate_page)(dma_addr_t vaddr, dma_addr_t paddr,
+			       unsigned order, void *vcm);
+};
+
+/**
+ * struct vcm_mmu - VCM MMU context
+ * @vcm:	VCM context.
+ * @driver:	VCM MMU driver's operations.
+ * @pool:	virtual address space allocator; internal.
+ * @bound_res:	list of bound reservations; internal.
+ * @mutex:	mutex protecting @bound_res; internal.
+ * @activated:	whether VCM context has been activated; internal.
+ */
+struct vcm_mmu {
+	struct vcm			vcm;
+	const struct vcm_mmu_driver	*driver;
+	/* internal */
+	struct gen_pool			*pool;
+	struct list_head		bound_res;
+	/* The mutex protects the bound_res list and the activated flag. */
+	struct mutex			mutex;
+	int				activated;
+};
+
+/**
+ * vcm_mmu_init() - initialises a VCM context for a real MMU.
+ * @mmu:	the vcm_mmu context to initialise.
+ *
+ * This function initialises the vcm_mmu structure created by an MMU
+ * driver when setting things up.  It sets up all fields of the
+ * structure except for @mmu->vcm.start, @mmu->vcm.size and
+ * @mmu->driver which are validated by this function.  If they have
+ * invalid values the function produces a warning and returns an
+ * error pointer.  On any other error, an error pointer is returned as
+ * well.  If everything is fine, the address of @mmu->vcm is returned.
+ */
+struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu);
+
+#endif
+
+#ifdef CONFIG_VCM_O2O
+
+/**
+ * struct vcm_o2o_driver - VCM One-to-One driver
+ * @cleanup:	cleans up the VCM context; if not specified, kfree() is used.
+ * @alloc:	physically contiguous memory allocator; the size of the
+ *		block to allocate is specified by part->size; the physical
+ *		address of the block must be returned in part->start;
+ *		on error must return an error-pointer, otherwise some
+ *		other pointer which will be passed to @free as priv;
+ *		required.
+ * @free:	physical memory freeing function; required.
+ */
+struct vcm_o2o_driver {
+	void (*cleanup)(struct vcm *vcm);
+	void *(*alloc)(struct vcm *vcm, struct vcm_phys_part *part,
+		       unsigned flags);
+	void (*free)(struct vcm_phys_part *part, void *priv);
+};
+
+/**
+ * struct vcm_o2o - VCM One-to-One context
+ * @vcm:	VCM context.
+ * @driver:	VCM One-to-One driver's operations.
+ */
+struct vcm_o2o {
+	struct vcm			vcm;
+	const struct vcm_o2o_driver	*driver;
+};
+
+/**
+ * vcm_o2o_init() - initialises a VCM context for a one-to-one context.
+ * @o2o:	the vcm_o2o context to initialise.
+ *
+ * This function initialises the vcm_o2o structure created by an O2O
+ * driver when setting things up.  It sets up all fields of the
+ * structure except for @o2o->vcm.start, @o2o->vcm.size and
+ * @o2o->driver which are validated by this function.  If they have
+ * invalid values the function produces a warning and returns an
+ * error pointer.  On any other error, an error pointer is returned as
+ * well.  If everything is fine, the address of @o2o->vcm is returned.
+ */
+struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o);
+
+#endif
+
+#ifdef CONFIG_VCM_PHYS
+
+/**
+ * vcm_phys_alloc() - allocates physical discontiguous space
+ * @size:	size of the block to allocate.
+ * @flags:	additional allocation flags; XXX FIXME: document
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ *
+ * This function tries to allocate physically discontiguous space in
+ * such a way that it allocates the largest possible blocks from the
+ * sizes denoted by the @orders array.  So if @orders is { 8, 0 }
+ * (which means 1MiB and 4KiB pages are to be used) and the requested
+ * @size is 2MiB and 12KiB, the function will try to allocate two 1MiB
+ * pages and three 4KiB pages (in that order).  If a big page cannot be
+ * allocated the function will still try to allocate more of the
+ * smaller pages.
+ */
+struct vcm_phys *__must_check
+vcm_phys_alloc(resource_size_t size, unsigned flags,
+	       const unsigned char *orders);
+
+/**
+ * vcm_phys_walk() - helper function for mapping physical pages
+ * @vaddr:	virtual address to map/unmap physical space to/from
+ * @phys:	physical space
+ * @orders:	array of orders of pages supported by the MMU sorted from
+ *		the largest to the smallest.  The last element is always
+ *		zero (which means 4K page).
+ * @callback:	function called for each page.
+ * @recovery:	function called for each page when @callback returns
+ *		negative number; if it also returns negative number
+ *		function terminates; may be NULL.
+ * @priv:	private data for the callbacks.
+ *
+ * This function walks through @phys trying to match the largest
+ * possible page size denoted by @orders.  For each such page @callback
+ * is called.  If @callback returns a negative number the function
+ * calls @recovery for each page @callback was called successfully on.
+ *
+ * So, for instance, if we have physical memory which consists of
+ * a 1MiB part and an 8KiB part and @orders is { 8, 0 } (which means
+ * 1MiB and 4KiB pages are to be used), @callback will be called first
+ * with a 1MiB page and then two times with a 4KiB page.  This is of
+ * course provided that @vaddr has the correct alignment.
+ *
+ * The idea is for hardware MMU drivers to call this function and
+ * provide a callbacks for mapping/unmapping a single page.  The
+ * function divides the region into pages that the MMU can handle.
+ *
+ * If @callback at one point returns a negative number this is the
+ * return value of the function; otherwise zero is returned.
+ */
+int vcm_phys_walk(dma_addr_t vaddr, const struct vcm_phys *phys,
+		  const unsigned char *orders,
+		  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv),
+		  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv),
+		  void *priv);
+
+#endif
+
+#endif
diff --git a/include/linux/vcm.h b/include/linux/vcm.h
new file mode 100644
index 0000000..965dc9b
--- /dev/null
+++ b/include/linux/vcm.h
@@ -0,0 +1,275 @@
+/*
+ * Virtual Contiguous Memory header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_H
+#define __LINUX_VCM_H
+
+#include <linux/kref.h>
+#include <linux/compiler.h>
+
+struct vcm_driver;
+struct vcm_phys;
+
+/**
+ * struct vcm - A virtually contiguous memory context.
+ * @start:	the smallest possible address available in this context.
+ * @size:	size of available address space in bytes; internal, read
+ *		only for MMU drivers.
+ * @activations:	How many times context was activated; internal,
+ *			read only for MMU drivers.
+ * @driver:	driver handling this context; internal.
+ *
+ * This structure represents a context of virtually contiguous memory
+ * managed by the MMU driver pointed to by @driver.  This is the main
+ * structure used to interact with the VCM framework.
+ *
+ * Whenever a driver wants to reserve virtual address space or allocate
+ * backing storage, a pointer to this structure must be passed.
+ *
+ */
+struct vcm {
+	dma_addr_t		start;
+	resource_size_t		size;
+	atomic_t		activations;
+	const struct vcm_driver	*driver;
+};
+
+/**
+ * struct vcm_res - A reserved virtually contiguous address space.
+ * @start:	bus address of the region in bytes; read only.
+ * @bound_size:	number of bytes actually bound to the virtual address;
+ *		read only.
+ * @res_size:	size of the reserved address space in bytes; read only.
+ * @vcm:	VCM context; internal, read only for MMU drivers.
+ * @phys:	pointer to physical memory bound to this reservation; NULL
+ *		if no physical memory is bound; read only.
+ *
+ * This structure represents a portion of virtually contiguous address
+ * space reserved for use by the driver.  Once address space is
+ * reserved, physical memory can be bound to it so that it will point
+ * to real memory.
+ */
+struct vcm_res {
+	dma_addr_t		start;
+	resource_size_t		bound_size;
+	resource_size_t		res_size;
+
+	struct vcm		*vcm;
+	struct vcm_phys		*phys;
+};
+
+
+/**
+ * vcm_destroy() - destroys a VCM context.
+ * @vcm:	VCM to destroy.
+ */
+void vcm_destroy(struct vcm *vcm);
+
+/**
+ * vcm_make_binding() - allocates memory and binds it to virtual address space
+ * @vcm:	VCM context to reserve virtual address space in
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @alloc_flags:	additional allocator flags; see vcm_alloc() for
+ *			description of those.
+ * @res_flags:	additional reservation flags; see vcm_reserve() for
+ *		description of those.
+ *
+ * This is a call that binds together three other calls:
+ * vcm_reserve(), vcm_alloc() and vcm_bind().  The purpose of this
+ * function is that on systems with no IO MMU separate calls to
+ * vcm_alloc() and vcm_reserve() may fail whereas when called together
+ * they may work correctly.
+ *
+ * This is a consequence of the fact that with no IO MMU the simulated
+ * virtual address must be the same as the physical address; thus, if
+ * the virtual address space were reserved first and the physical
+ * memory allocated afterwards, the two addresses might not match.
+ *
+ * With this call, a driver that simulates an IO MMU may simply
+ * allocate physical memory and, when this succeeds, create the
+ * matching reservation.
+ *
+ * In short, if device drivers do not need more advanced MMU
+ * functionality, they should limit themselves to this function since
+ * then the drivers may be easily ported to systems without an IO MMU.
+ *
+ * To access the vcm_phys structure created by this call, the phys
+ * field of the returned vcm_res structure should be used.
+ *
+ * On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_res  *__must_check
+vcm_make_binding(struct vcm *vcm, resource_size_t size,
+		 unsigned alloc_flags, unsigned res_flags);
+
+/**
+ * vcm_map() - makes a reservation and binds physical memory to it
+ * @vcm:	VCM context
+ * @phys:	physical memory to bind.
+ * @flags:	additional flags; see vcm_reserve() for	description of
+ *		those.
+ *
+ * This is a call that binds together two other calls: vcm_reserve()
+ * and vcm_bind().  If all you need is to reserve address space and
+ * bind physical memory it's better to use this call since it may
+ * create better mappings in some situations.
+ *
+ * Drivers may be optimised in such a way that it won't be possible to
+ * use the reservation with different physical memory.
+ *
+ * On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_res *__must_check
+vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags);
+
+/**
+ * vcm_alloc() - allocates a physical memory for use with vcm_res.
+ * @vcm:	VCM context allocation is performed in.
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @flags:	additional allocator flags; XXX FIXME: describe
+ *
+ * In case of some MMU drivers, the @vcm may be important and later
+ * binding (vcm_bind()) may fail if done on another @vcm.
+ *
+ * On success returns a vcm_phys structure representing an allocated
+ * physical memory that can be bound to reserved virtual address
+ * space.  On error returns a pointer which yields true when tested with
+ * IS_ERR().
+ */
+struct vcm_phys *__must_check
+vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+/**
+ * vcm_free() - frees an allocated physical memory
+ * @phys:	physical memory to free.
+ *
+ * If the physical memory is bound to any reserved address space it
+ * must be unbound first.  Otherwise a warning will be issued and
+ * the memory won't be freed causing memory leaks.
+ */
+void vcm_free(struct vcm_phys *phys);
+
+/**
+ * vcm_reserve() - reserves a portion of virtual address space.
+ * @vcm:	VCM context reservation is performed in.
+ * @size:	number of bytes to allocate; aligned up to a PAGE_SIZE
+ * @flags:	additional reservation flags; XXX FIXME: describe
+ *
+ * On success returns a vcm_res structure representing a reserved
+ * (contiguous) virtual address space that physical memory can be
+ * bound to (using vcm_bind()).  On error returns a pointer which
+ * yields true when tested with IS_ERR().
+ */
+struct vcm_res *__must_check
+vcm_reserve(struct vcm *vcm, resource_size_t size, unsigned flags);
+
+/**
+ * vcm_unreserve() - destroys a virtual address space reservation
+ * @res:	reservation to destroy.
+ *
+ * If any physical memory is bound to the reserved address space it
+ * must be unbound first.  Otherwise it will be unbound automatically
+ * and a warning will be issued.
+ */
+void vcm_unreserve(struct vcm_res *res);
+
+/**
+ * vcm_bind() - binds a physical memory to virtual address space
+ * @res:	virtual address space to bind the physical memory.
+ * @phys:	physical memory to bind to the virtual addresses.
+ *
+ * The mapping won't be active unless vcm_activate() has been called
+ * on the VCM context that @res was created in.
+ *
+ * If @phys is already bound to @res this function returns -EALREADY.
+ * If some other physical memory is bound to @res, -EADDRINUSE is
+ * returned.  If the size of the physical memory is larger than the
+ * reserved virtual space, -ENOSPC is returned.  In all other cases the
+ * physical memory is bound to the virtual address and on success zero
+ * is returned, on error a negative number.
+ */
+int  __must_check vcm_bind(struct vcm_res *res, struct vcm_phys *phys);
+
+/**
+ * vcm_unbind() - unbinds a physical memory from virtual address space
+ * @res:	virtual address space to unbind the physical memory from.
+ *
+ * This reverses the effect of the vcm_bind() function.  Function
+ * returns physical space that was bound to the reservation (or NULL
+ * if no space was bound in which case also a warning is issued).
+ */
+struct vcm_phys *vcm_unbind(struct vcm_res *res);
+
+/**
+ * vcm_destroy_binding() - destroys the binding
+ * @res:	a bound reserved address space to destroy.
+ *
+ * This function incorporates three functions: vcm_unbind(),
+ * vcm_free() and vcm_unreserve() (in that order) in one call.
+ */
+void vcm_destroy_binding(struct vcm_res *res);
+
+/**
+ * vcm_unmap() - unbinds physical memory and unreserves address space
+ * @res:	reservation to destroy
+ *
+ * This is a call that binds together two other calls: vcm_unbind()
+ * and vcm_unreserve().
+ */
+static inline void vcm_unmap(struct vcm_res *res)
+{
+	vcm_unbind(res);
+	vcm_unreserve(res);
+}
+
+/**
+ * vcm_activate() - activates bindings in VCM.
+ * @vcm:	VCM to activate bindings in.
+ *
+ * All of the bindings on the @vcm done before this function is called
+ * are inactive and do not take effect.  The call to this function
+ * guarantees that all bindings are sent to the hardware MMU (if any).
+ *
+ * After VCM is activated all bindings will be automatically updated
+ * on the hardware MMU, so there is no need to call this function
+ * after each vcm_bind()/vcm_unbind().
+ *
+ * Each call to vcm_activate() should be later accompanied by a call
+ * to vcm_deactivate().  Otherwise a warning will be issued when VCM
+ * context is destroyed (vcm_destroy()).  This function can be called
+ * several times.
+ *
+ * On success returns zero, on error a negative error code.
+ */
+int  __must_check vcm_activate(struct vcm *vcm);
+
+/**
+ * vcm_deactivate() - deactivates bindings in VCM.
+ * @vcm:	VCM to deactivate bindings in.
+ *
+ * This function reverts effect of the vcm_activate() function.  After
+ * calling this function caller has no guarantee that bindings defined
+ * in VCM are active.
+ *
+ * If this is called without calling the vcm_activate() warning is
+ * issued.
+ */
+void vcm_deactivate(struct vcm *vcm);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index b410910..0445f68 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -371,3 +371,33 @@ config CMA_BEST_FIT
 	  the number of allocated regions and usually much smaller).  It
 	  allocates area from the smallest hole that is big enough for
 	  allocation in question.
+
+
+config VCM
+	bool "Virtual Contiguous Memory framework"
+	help
+	  This enables the Virtual Contiguous Memory framework which
+	  provides an abstraction for virtual address space provided by
+	  various MMUs present on the platform.
+
+	  The framework uses pluggable MMU drivers for hardware MMUs and,
+	  if drivers obey some limitations, it can also be used on
+	  platforms with no MMU.
+
+	  For more information see
+	  <Documentation/virtual-contiguous-memory.txt>.  If unsure, say
+	  "n".
+
+# Select it if you need vcm_mmu wrapper driver
+config VCM_MMU
+	select VCM_PHYS
+	select GENERIC_ALLOCATOR
+	bool
+
+# Select if you need vcm_o2o wrapper driver
+config VCM_O2O
+	bool
+
+# Select if you need vcm_phys_alloc() or vcm_phys_walk() functions
+config VCM_PHYS
+	bool
diff --git a/mm/Makefile b/mm/Makefile
index d8c717f..e908202 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -49,3 +49,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
 obj-$(CONFIG_CMA) += cma.o
 obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
+obj-$(CONFIG_VCM) += vcm.o
diff --git a/mm/vcm.c b/mm/vcm.c
new file mode 100644
index 0000000..ef3d1a6
--- /dev/null
+++ b/mm/vcm.c
@@ -0,0 +1,932 @@
+/*
+ * Virtual Contiguous Memory core
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#include <linux/vcm-drv.h>
+#include <linux/module.h>
+#include <linux/mm.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+#include <linux/genalloc.h>
+
+#include <asm/atomic.h>
+
+/******************************** Devices API *******************************/
+
+void vcm_destroy(struct vcm *vcm)
+{
+	if (WARN_ON(atomic_read(&vcm->activations)))
+		vcm->driver->deactivate(vcm);
+
+	if (vcm->driver->cleanup)
+		vcm->driver->cleanup(vcm);
+	else
+		kfree(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_destroy);
+
+static void
+__vcm_alloc_and_reserve(struct vcm *vcm, resource_size_t size,
+			struct vcm_phys **phys, unsigned alloc_flags,
+			struct vcm_res **res, unsigned res_flags)
+{
+	int ret, alloc = 0;
+
+	if (WARN_ON(!vcm) || !size) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	size = PAGE_ALIGN(size);
+
+	if (vcm->driver->alloc) {
+		ret = vcm->driver->alloc(vcm, size,
+					 phys, alloc_flags, res, res_flags);
+		if (ret)
+			goto error;
+		alloc = 1;
+	} else if ((res && !vcm->driver->res) || (phys && !vcm->driver->phys)) {
+		ret = -EOPNOTSUPP;
+		goto error;
+	}
+
+	if (res) {
+		if (!alloc) {
+			*res = vcm->driver->res(vcm, size, res_flags);
+			if (IS_ERR(*res)) {
+				ret = PTR_ERR(*res);
+				goto error;
+			}
+		}
+		(*res)->bound_size = 0;
+		(*res)->vcm = vcm;
+		(*res)->phys = NULL;
+	}
+
+	if (phys) {
+		if (!alloc) {
+			*phys = vcm->driver->phys(vcm, size, alloc_flags);
+			if (IS_ERR(*phys)) {
+				if (res)
+					vcm_unreserve(*res);
+				ret = PTR_ERR(*phys);
+				goto error;
+			}
+		}
+		atomic_set(&(*phys)->bindings, 0);
+		WARN_ON(!(*phys)->free);
+	}
+
+	return;
+
+error:
+	if (phys)
+		*phys = ERR_PTR(ret);
+	if (res)
+		*res = ERR_PTR(ret);
+}
+
+struct vcm_res *__must_check
+vcm_make_binding(struct vcm *vcm, resource_size_t size,
+		 unsigned alloc_flags, unsigned res_flags)
+{
+	struct vcm_phys *phys;
+	struct vcm_res *res;
+
+	if (WARN_ON(!vcm || !size || (size & (PAGE_SIZE - 1))))
+		return ERR_PTR(-EINVAL);
+	else if (vcm->driver->alloc || !vcm->driver->map) {
+		int ret;
+
+		__vcm_alloc_and_reserve(vcm, size, &phys, alloc_flags,
+					&res, res_flags);
+
+		if (IS_ERR(res))
+			return res;
+
+		ret = vcm_bind(res, phys);
+		if (!ret)
+			return res;
+
+		if (vcm->driver->unreserve)
+			vcm->driver->unreserve(res);
+		phys->free(phys);
+		return ERR_PTR(ret);
+	} else {
+		__vcm_alloc_and_reserve(vcm, size, &phys, alloc_flags,
+					NULL, 0);
+
+		if (IS_ERR(phys))
+			return ERR_CAST(phys);
+
+		res = vcm->driver->map(vcm, phys, res_flags);
+		if (IS_ERR(res))
+			phys->free(phys);
+		return res;
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_make_binding);
+
+struct vcm_phys *__must_check
+vcm_alloc(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_phys *phys;
+
+	__vcm_alloc_and_reserve(vcm, size, &phys, flags, NULL, 0);
+
+	return phys;
+}
+EXPORT_SYMBOL_GPL(vcm_alloc);
+
+struct vcm_res *__must_check
+vcm_reserve(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_res *res;
+
+	__vcm_alloc_and_reserve(vcm, size, NULL, 0, &res, flags);
+
+	return res;
+}
+EXPORT_SYMBOL_GPL(vcm_reserve);
+
+struct vcm_res *__must_check
+vcm_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags)
+{
+	struct vcm_res *res;
+	int ret;
+
+	if (WARN_ON(!vcm))
+		return ERR_PTR(-EINVAL);
+
+	if (vcm->driver->map)
+		return vcm->driver->map(vcm, phys, flags);
+
+	res = vcm_reserve(vcm, phys->size, flags);
+	if (IS_ERR(res))
+		return res;
+
+	ret = vcm_bind(res, phys);
+	if (!ret)
+		return res;
+
+	vcm_unreserve(res);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(vcm_map);
+
+void vcm_unreserve(struct vcm_res *res)
+{
+	if (!WARN_ON(!res)) {
+		if (WARN_ON(res->phys))
+			vcm_unbind(res);
+		if (!WARN_ON_ONCE(!res->vcm->driver->unreserve))
+			res->vcm->driver->unreserve(res);
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_unreserve);
+
+void vcm_free(struct vcm_phys *phys)
+{
+	if (!WARN_ON(!phys || atomic_read(&phys->bindings)) && phys->free)
+		phys->free(phys);
+}
+EXPORT_SYMBOL_GPL(vcm_free);
+
+int  __must_check vcm_bind(struct vcm_res *res, struct vcm_phys *phys)
+{
+	int ret;
+
+	if (WARN_ON(!res || !phys))
+		return -EINVAL;
+
+	if (res->phys == phys)
+		return -EALREADY;
+
+	if (res->phys)
+		return -EADDRINUSE;
+
+	if (phys->size > res->res_size)
+		return -ENOSPC;
+
+	if (!res->vcm->driver->bind)
+		return -EOPNOTSUPP;
+
+	atomic_inc(&phys->bindings);
+	ret = res->vcm->driver->bind(res, phys);
+	if (!ret) {
+		res->phys = phys;
+		res->bound_size = phys->size;
+	} else {
+		atomic_dec(&phys->bindings);
+	}
+	return ret;
+}
+EXPORT_SYMBOL_GPL(vcm_bind);
+
+struct vcm_phys *vcm_unbind(struct vcm_res *res)
+{
+	struct vcm_phys *phys = NULL;
+	if (!WARN_ON(!res || !res->phys)) {
+		phys = res->phys;
+		if (res->vcm->driver->unbind)
+			res->vcm->driver->unbind(res);
+		atomic_dec(&phys->bindings);
+		res->phys = NULL;
+		res->bound_size = 0;
+	}
+	return phys;
+}
+EXPORT_SYMBOL_GPL(vcm_unbind);
+
+void vcm_destroy_binding(struct vcm_res *res)
+{
+	if (!WARN_ON(!res)) {
+		struct vcm_phys *phys = vcm_unbind(res);
+		if (phys)
+			vcm_free(phys);
+		vcm_unreserve(res);
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_destroy_binding);
+
+int  __must_check vcm_activate(struct vcm *vcm)
+{
+	if (WARN_ON(!vcm))
+		return -EINVAL;
+	else if (atomic_inc_return(&vcm->activations) != 1
+	      || !vcm->driver->activate)
+		return 0;
+	else
+		return vcm->driver->activate(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_activate);
+
+void vcm_deactivate(struct vcm *vcm)
+{
+	if (!WARN_ON(!vcm || !atomic_read(&vcm->activations))
+	 && atomic_dec_and_test(&vcm->activations)
+	 && vcm->driver->deactivate)
+		vcm->driver->deactivate(vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_deactivate);
+
+
+/****************************** VCM Drivers API *****************************/
+
+struct vcm *__must_check vcm_init(struct vcm *vcm)
+{
+	if (WARN_ON(!vcm || !vcm->size
+		 || ((vcm->start | vcm->size) & ~PAGE_MASK)
+		 || !vcm->driver || !vcm->driver->unreserve))
+		return ERR_PTR(-EINVAL);
+
+	atomic_set(&vcm->activations, 0);
+
+	return vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_init);
+
+
+/*************************** Hardware MMU wrapper ***************************/
+
+#ifdef CONFIG_VCM_MMU
+
+struct vcm_mmu_res {
+	struct vcm_res			res;
+	struct list_head		bound;
+};
+
+static void vcm_mmu_cleanup(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	WARN_ON(mutex_is_locked(&mmu->mutex));
+	gen_pool_destroy(mmu->pool);
+	if (mmu->driver->cleanup)
+		mmu->driver->cleanup(vcm);
+	else
+		kfree(mmu);
+}
+
+static struct vcm_res *
+vcm_mmu_res(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	resource_size_t s, alignment;
+	struct vcm_mmu_res *res;
+	const unsigned char *orders;
+	dma_addr_t addr;
+
+	res = kzalloc(sizeof *res, GFP_KERNEL);
+	if (!res)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Use the largest alignment that makes sense for given
+	 * reservation size.  For instance, if MMU supports 1M pages
+	 * and reservation is 1M it would be nice to be able to have
+	 * reservation aligned to 1M so that if the physical memory
+	 * will consist of a single 1M block (aligned to 1M) a single
+	 * map entry will suffice.
+	 */
+	s = size >> PAGE_SHIFT;
+	for (orders = mmu->driver->orders; !(s >> *orders); ++orders)
+		/* nop */;
+	alignment = (resource_size_t)1 << (*orders + PAGE_SHIFT);
+
+	/*
+	 * We are allocating a bit more so that if allocation is not
+	 * aligned we can shift inside allocated block to get
+	 * allocation we want.
+	 */
+	s = size + alignment - PAGE_SIZE;
+
+	mutex_lock(&mmu->mutex);
+
+	addr = gen_pool_alloc(mmu->pool, s);
+
+	if (!addr) {
+		kfree(res);
+		res = ERR_PTR(-ENOSPC);
+	} else if (alignment > PAGE_SIZE) {
+		/*
+		 * Align the reservation.  We can safely do this since
+		 * we have allocated more memory then we needed and we
+		 * can move reservation around.
+		 */
+		dma_addr_t start = ALIGN(addr, alignment);
+
+		/* Free unused memory. */
+		if (start != addr)
+			gen_pool_free(mmu->pool, addr, start - addr);
+		if (start + size != addr + s)
+			gen_pool_free(mmu->pool, start + size,
+				      addr + s - (start + size));
+		addr = start;
+	}
+
+	mutex_unlock(&mmu->mutex);
+
+	if (!IS_ERR(res)) {
+		INIT_LIST_HEAD(&res->bound);
+		res->res.start = addr;
+		res->res.res_size = size;
+	}
+
+	return &res->res;
+}
+
+static struct vcm_phys *
+vcm_mmu_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	return vcm_phys_alloc(size, flags,
+			      container_of(vcm, struct vcm_mmu,
+					   vcm)->driver->orders);
+}
+
+static int __must_check
+__vcm_mmu_activate(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	if (mmu->driver->activate)
+		return mmu->driver->activate(res, phys);
+
+	return vcm_phys_walk(res->start, phys, mmu->driver->orders,
+			     mmu->driver->activate_page,
+			     mmu->driver->deactivate_page, res->vcm);
+}
+
+static void __vcm_mmu_deactivate(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	if (mmu->driver->deactivate)
+		return mmu->driver->deactivate(res, phys);
+
+	vcm_phys_walk(res->start, phys, mmu->driver->orders,
+		      mmu->driver->deactivate_page, NULL, res->vcm);
+}
+
+static int vcm_mmu_bind(struct vcm_res *_res, struct vcm_phys *phys)
+{
+	struct vcm_mmu_res *res = container_of(_res, struct vcm_mmu_res, res);
+	struct vcm_mmu *mmu = container_of(_res->vcm, struct vcm_mmu, vcm);
+	int ret;
+
+	mutex_lock(&mmu->mutex);
+
+	if (mmu->activated) {
+		ret = __vcm_mmu_activate(_res, phys);
+		if (ret)
+			goto done;
+	}
+
+	list_add_tail(&res->bound, &mmu->bound_res);
+	ret = 0;
+
+done:
+	mutex_unlock(&mmu->mutex);
+
+	return ret;
+}
+
+static void vcm_mmu_unbind(struct vcm_res *_res)
+{
+	struct vcm_mmu_res *res = container_of(_res, struct vcm_mmu_res, res);
+	struct vcm_mmu *mmu = container_of(_res->vcm, struct vcm_mmu, vcm);
+
+	mutex_lock(&mmu->mutex);
+
+	if (mmu->activated)
+		__vcm_mmu_deactivate(_res, _res->phys);
+
+	list_del_init(&res->bound);
+
+	mutex_unlock(&mmu->mutex);
+}
+
+static void vcm_mmu_unreserve(struct vcm_res *res)
+{
+	struct vcm_mmu *mmu = container_of(res->vcm, struct vcm_mmu, vcm);
+	mutex_lock(&mmu->mutex);
+	gen_pool_free(mmu->pool, res->start, res->res_size);
+	mutex_unlock(&mmu->mutex);
+}
+
+static int vcm_mmu_activate(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	struct vcm_mmu_res *r, *rr;
+	int ret;
+
+	mutex_lock(&mmu->mutex);
+
+	list_for_each_entry(r, &mmu->bound_res, bound) {
+		ret = __vcm_mmu_activate(&r->res, r->res.phys);
+		if (ret >= 0)
+			continue;
+
+		/* Activation failed: roll back already activated bindings. */
+		list_for_each_entry(rr, &mmu->bound_res, bound) {
+			if (r == rr)
+				goto done;
+			__vcm_mmu_deactivate(&rr->res, rr->res.phys);
+		}
+	}
+
+	mmu->activated = 1;
+	ret = 0;
+
+done:
+	mutex_unlock(&mmu->mutex);
+
+	return ret;
+}
+
+static void vcm_mmu_deactivate(struct vcm *vcm)
+{
+	struct vcm_mmu *mmu = container_of(vcm, struct vcm_mmu, vcm);
+	struct vcm_mmu_res *r;
+
+	mutex_lock(&mmu->mutex);
+
+	mmu->activated = 0;
+
+	list_for_each_entry(r, &mmu->bound_res, bound)
+		__vcm_mmu_deactivate(&r->res, r->res.phys);
+
+	mutex_unlock(&mmu->mutex);
+}
+
+struct vcm *__must_check vcm_mmu_init(struct vcm_mmu *mmu)
+{
+	static const struct vcm_driver driver = {
+		.cleanup	= vcm_mmu_cleanup,
+		.res		= vcm_mmu_res,
+		.phys		= vcm_mmu_phys,
+		.bind		= vcm_mmu_bind,
+		.unbind		= vcm_mmu_unbind,
+		.unreserve	= vcm_mmu_unreserve,
+		.activate	= vcm_mmu_activate,
+		.deactivate	= vcm_mmu_deactivate,
+	};
+
+	struct vcm *vcm;
+	int ret;
+
+	if (WARN_ON(!mmu || !mmu->driver ||
+		    !(mmu->driver->activate ||
+		      (mmu->driver->activate_page &&
+		       mmu->driver->deactivate_page)) ||
+		    !(mmu->driver->deactivate ||
+		      mmu->driver->deactivate_page)))
+		return ERR_PTR(-EINVAL);
+
+	mmu->vcm.driver = &driver;
+	vcm = vcm_init(&mmu->vcm);
+	if (IS_ERR(vcm))
+		return vcm;
+
+	mmu->pool = gen_pool_create(PAGE_SHIFT, -1);
+	if (!mmu->pool)
+		return ERR_PTR(-ENOMEM);
+
+	ret = gen_pool_add(mmu->pool, mmu->vcm.start, mmu->vcm.size, -1);
+	if (ret) {
+		gen_pool_destroy(mmu->pool);
+		return ERR_PTR(ret);
+	}
+
+	vcm->driver     = &driver;
+	INIT_LIST_HEAD(&mmu->bound_res);
+	mutex_init(&mmu->mutex);
+
+	return &mmu->vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_mmu_init);
+
+#endif
+
+/**************************** One-to-One wrapper ****************************/
+
+#ifdef CONFIG_VCM_O2O
+
+struct vcm_o2o_binding {
+	void			*priv;
+	unsigned long		dead[1];
+	struct vcm_res		res;
+	struct vcm_phys		phys;
+	/* vcm_phys is variable length, don't put anything at the end */
+};
+
+static void vcm_o2o_cleanup(struct vcm *vcm)
+{
+	struct vcm_o2o *o2o = container_of(vcm, struct vcm_o2o, vcm);
+	if (o2o->driver->cleanup)
+		o2o->driver->cleanup(vcm);
+	else
+		kfree(o2o);
+}
+
+static void vcm_o2o_free(struct vcm_phys *phys)
+{
+	struct vcm_o2o_binding *b =
+		container_of(phys, struct vcm_o2o_binding, phys);
+	struct vcm_o2o *o2o =
+		container_of(b->res.vcm, struct vcm_o2o, vcm);
+	o2o->driver->free(phys->parts, b->priv);
+	if (test_and_set_bit(0, b->dead))
+		kfree(b);
+}
+
+static void vcm_o2o_unreserve(struct vcm_res *res)
+{
+	struct vcm_o2o_binding *b =
+		container_of(res, struct vcm_o2o_binding, res);
+	if (test_and_set_bit(0, b->dead))
+		kfree(b);
+}
+
+static struct vcm_phys *
+vcm_o2o_phys(struct vcm *vcm, resource_size_t size, unsigned flags)
+{
+	struct vcm_o2o *o2o = container_of(vcm, struct vcm_o2o, vcm);
+	struct vcm_o2o_binding *b;
+	void *priv;
+
+	b = kmalloc(sizeof *b + sizeof *b->phys.parts, GFP_KERNEL);
+	if (!b)
+		return ERR_PTR(-ENOMEM);
+
+	b->phys.parts->start = 0;
+	b->phys.parts->size  = size;
+	priv = o2o->driver->alloc(vcm, b->phys.parts, flags);
+	if (IS_ERR(priv)) {
+		kfree(b);
+		return ERR_CAST(priv);
+	}
+
+	if (WARN_ON(!b->phys.parts->size ||
+		    (b->phys.parts->start | b->phys.parts->size)
+		  & ~PAGE_MASK)) {
+		o2o->driver->free(b->phys.parts, priv);
+		kfree(b);
+		return ERR_PTR(-EINVAL);
+	}
+
+	b->priv		= priv;
+	b->dead[0]	= ~0;
+	b->res.vcm	= vcm;
+	b->res.start	= b->phys.parts->start;
+	b->res.res_size	= b->phys.parts->size;
+	b->phys.size	= b->phys.parts->size;
+	b->phys.count	= 1;
+	b->phys.free	= vcm_o2o_free;
+
+	return &b->phys;
+}
+
+static struct vcm_res *
+vcm_o2o_map(struct vcm *vcm, struct vcm_phys *phys, unsigned flags)
+{
+	struct vcm_o2o_binding *b =
+		container_of(phys, struct vcm_o2o_binding, phys);
+
+	if (!test_and_clear_bit(0, b->dead))
+		return ERR_PTR(-EBUSY);
+
+	return &b->res;
+}
+
+static int vcm_o2o_bind(struct vcm_res *res, struct vcm_phys *phys)
+{
+	struct vcm_o2o_binding *b =
+		container_of(res, struct vcm_o2o_binding, res);
+
+	if (&b->phys != phys)
+		return -EOPNOTSUPP;
+
+	if (WARN_ON(test_bit(0, b->dead)))
+		return -EINVAL;
+
+	return 0;
+}
+
+struct vcm *__must_check vcm_o2o_init(struct vcm_o2o *o2o)
+{
+	static const struct vcm_driver driver = {
+		.cleanup	= vcm_o2o_cleanup,
+		.phys		= vcm_o2o_phys,
+		.map		= vcm_o2o_map,
+		.bind		= vcm_o2o_bind,
+		.unreserve	= vcm_o2o_unreserve,
+	};
+
+	if (WARN_ON(!o2o || !o2o->driver ||
+		    !o2o->driver->alloc || !o2o->driver->free))
+		return ERR_PTR(-EINVAL);
+
+	o2o->vcm.driver = &driver;
+	return vcm_init(&o2o->vcm);
+}
+EXPORT_SYMBOL_GPL(vcm_o2o_init);
+
+#endif
+
+/************************ Physical memory management ************************/
+
+#ifdef CONFIG_VCM_PHYS
+
+struct vcm_phys_list {
+	struct vcm_phys_list	*next;
+	unsigned		count;
+	struct vcm_phys_part	parts[31];
+};
+
+static struct vcm_phys_list *__must_check
+vcm_phys_alloc_list_order(struct vcm_phys_list *last, resource_size_t *pages,
+			  unsigned flags, unsigned order, unsigned *total)
+{
+	unsigned count;
+
+	count	= *pages >> order;
+
+	/* So, we need 'count' pages of the given order. */
+	do {
+		struct page *p = alloc_pages(GFP_DMA, order);
+
+		if (!p)
+			/*
+			 * If allocation failed we may still
+			 * try to continue allocating smaller
+			 * pages.
+			 */
+			break;
+
+		if (last->count == ARRAY_SIZE(last->parts)) {
+			struct vcm_phys_list *l;
+			l = kmalloc(sizeof *l, GFP_KERNEL);
+			if (!l)
+				return NULL;
+
+			l->next = NULL;
+			l->count = 0;
+			last->next = l;
+			last = l;
+		}
+
+		last->parts[last->count].start =
+			page_to_pfn(p) << PAGE_SHIFT;
+		last->parts[last->count].size =
+			(resource_size_t)1 << (order + PAGE_SHIFT);
+		++last->count;
+		++*total;
+		*pages -= 1 << order;
+	} while (--count);
+
+	return last;
+}
+
+static unsigned __must_check
+vcm_phys_alloc_list(struct vcm_phys_list *first,
+		    resource_size_t size, unsigned flags,
+		    const unsigned char *orders)
+{
+	struct vcm_phys_list *last = first;
+	unsigned total_parts = 0;
+	resource_size_t pages;
+
+	/*
+	 * We are trying to allocate as large pages as possible but
+	 * not larger than the pages that the MMU driver that called
+	 * us supports (ie. the ones provided by the orders array).
+	 * This makes it possible to map the region using the fewest
+	 * possible number of entries.
+	 */
+	pages = size >> PAGE_SHIFT;
+	do {
+		while (!(pages >> *orders))
+			++orders;
+
+		last = vcm_phys_alloc_list_order(last, &pages, flags, *orders,
+						 &total_parts);
+		if (!last)
+			return 0;
+
+	} while (*orders++ && pages);
+
+	if (pages)
+		return 0;
+
+	return total_parts;
+}
+
+static void vcm_phys_free_parts(struct vcm_phys_part *parts, unsigned count)
+{
+	do {
+		/* parts->start is a physical address, not a virtual one. */
+		__free_pages(pfn_to_page(parts->start >> PAGE_SHIFT),
+			     ffs(parts->size) - 1 - PAGE_SHIFT);
+		++parts;
+	} while (--count);
+}
+
+static void vcm_phys_alloc_cleanup(struct vcm_phys_list *lst)
+{
+	struct vcm_phys_list *first = lst;
+	do {
+		struct vcm_phys_list *l;
+
+		vcm_phys_free_parts(lst->parts, lst->count);
+
+		l = lst->next;
+		if (lst != first)
+			kfree(lst);
+		lst = l;
+	} while (lst);
+}
+
+static void vcm_phys_free(struct vcm_phys *phys)
+{
+	vcm_phys_free_parts(phys->parts, phys->count);
+}
+
+struct vcm_phys *__must_check
+vcm_phys_alloc(resource_size_t size, unsigned flags,
+	       const unsigned char *orders)
+{
+	struct vcm_phys_list first = { NULL, 0 }, *lst;
+	struct vcm_phys_part *out;
+	struct vcm_phys *phys;
+	unsigned count;
+
+	if (WARN_ON((size & (PAGE_SIZE - 1)) || !size || !orders))
+		return ERR_PTR(-EINVAL);
+
+	count = vcm_phys_alloc_list(&first, size, flags, orders);
+	if (!count)
+		goto error;
+
+	phys = kmalloc(sizeof *phys + count * sizeof *phys->parts, GFP_KERNEL);
+	if (!phys)
+		goto error;
+
+	phys->free = vcm_phys_free;
+	phys->count = count;
+	phys->size = size;
+
+	out = phys->parts;
+	lst = &first;
+	do {
+		struct vcm_phys_list *l;
+
+		memcpy(out, lst->parts, lst->count * sizeof *out);
+		out += lst->count;
+
+		l = lst->next;
+		if (lst != &first)
+			kfree(lst);
+		lst = l;
+	} while (lst);
+
+	return phys;
+
+error:
+	vcm_phys_alloc_cleanup(&first);
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(vcm_phys_alloc);
+
+static inline bool is_of_order(dma_addr_t size, unsigned order)
+{
+	return !(size & (((dma_addr_t)PAGE_SIZE << order) - 1));
+}
+
+static int
+__vcm_phys_walk_part(dma_addr_t vaddr, const struct vcm_phys_part *part,
+		     const unsigned char *orders,
+		     int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				     unsigned order, void *priv), void *priv,
+		     unsigned *limit)
+{
+	resource_size_t size = part->size;
+	dma_addr_t paddr = part->start;
+	resource_size_t ps;
+
+	while (!is_of_order(vaddr, *orders))
+		++orders;
+	while (!is_of_order(paddr, *orders))
+		++orders;
+
+	ps = PAGE_SIZE << *orders;
+	for (; *limit && size; --*limit) {
+		int ret;
+
+		while (ps > size)
+			ps = PAGE_SIZE << *++orders;
+
+		ret = callback(vaddr, paddr, *orders, priv);
+		if (ret < 0)
+			return ret;
+
+		ps = PAGE_SIZE << *orders;
+		vaddr += ps;
+		paddr += ps;
+		size  -= ps;
+	}
+
+	return 0;
+}
+
+int vcm_phys_walk(dma_addr_t _vaddr, const struct vcm_phys *phys,
+		  const unsigned char *orders,
+		  int (*callback)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *arg),
+		  int (*recovery)(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *arg),
+		  void *priv)
+{
+	unsigned limit = ~0;
+	int r = 0;
+
+	if (WARN_ON(!phys || ((_vaddr | phys->size) & (PAGE_SIZE - 1)) ||
+		    !phys->size || !orders || !callback))
+		return -EINVAL;
+
+	for (;;) {
+		const struct vcm_phys_part *part = phys->parts;
+		unsigned count = phys->count;
+		dma_addr_t vaddr = _vaddr;
+		int ret = 0;
+
+		for (; count && limit; --count, ++part) {
+			ret = __vcm_phys_walk_part(vaddr, part, orders,
+						   callback, priv, &limit);
+			if (ret)
+				break;
+
+			vaddr += part->size;
+		}
+
+		if (r)
+			/* We have already run the recovery pass */
+			return r;
+
+		/*
+		 * Either the operation succeeded or we were not provided
+		 * with a recovery callback -- return.
+		 */
+		if (!ret || !recovery)
+			return ret;
+
+		/* Switch to recovery */
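+		/*
+		 * 'limit' was decremented once per successful callback
+		 * invocation, so ~0 - limit is the number of pages already
+		 * processed; the recovery pass below is run for exactly
+		 * those pages.
+		 */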
+		limit = ~0 - limit;
+		callback = recovery;
+		r = ret;
+	}
+}
+EXPORT_SYMBOL_GPL(vcm_phys_walk);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 8/9] mm: vcm: Sample driver added
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This commit adds a sample Virtual Contiguous Memory framework
driver.  It handles no real hardware and is provided only for
demonstration purposes.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/virtual-contiguous-memory.txt |    3 +
 include/linux/vcm-sample.h                  |   30 +++++++
 mm/Kconfig                                  |    9 ++
 mm/Makefile                                 |    1 +
 mm/vcm-sample.c                             |  120 +++++++++++++++++++++++++++
 5 files changed, 163 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/vcm-sample.h
 create mode 100644 mm/vcm-sample.c

diff --git a/Documentation/virtual-contiguous-memory.txt b/Documentation/virtual-contiguous-memory.txt
index 0c0e90c..6d1014c 100644
--- a/Documentation/virtual-contiguous-memory.txt
+++ b/Documentation/virtual-contiguous-memory.txt
@@ -730,6 +730,9 @@ already there.
 If you want to use this wrapper, you need to select VCM_MMU Kconfig
 option.
 
+A sample driver is provided as a template for real drivers.  It can
+be found in the [[file:../mm/vcm-sample.c][mm/vcm-sample.c]] file.
+
 *** Context creation
 
 Similarly to normal drivers, MMU driver needs to provide a context
diff --git a/include/linux/vcm-sample.h b/include/linux/vcm-sample.h
new file mode 100644
index 0000000..9a79403
--- /dev/null
+++ b/include/linux/vcm-sample.h
@@ -0,0 +1,30 @@
+/*
+ * Virtual Contiguous Memory driver template header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_SAMPLE_H
+#define __LINUX_VCM_SAMPLE_H
+
+#include <linux/vcm.h>
+
+struct vcm;
+
+/**
+ * vcm_samp_create() - creates a VCM context
+ *
+ * ... Documentation goes here ...
+ */
+struct vcm *__must_check vcm_samp_create(/* ... */);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index 0445f68..be040e7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -401,3 +401,12 @@ config VCM_O2O
 # Select if you need vcm_phys_alloc() or vcm_phys_walk() functions
 config VCM_PHYS
 	bool
+
+config VCM_SAMP
+	bool "VCM sample driver"
+	depends on VCM
+	select VCM_MMU
+	help
+	  This enables a sample driver for the VCM framework.  This driver
+	  does not handle any real hardware.  It's merely a template
+	  for real drivers.
diff --git a/mm/Makefile b/mm/Makefile
index e908202..c465dfa 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -50,3 +50,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
 obj-$(CONFIG_CMA) += cma.o
 obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
 obj-$(CONFIG_VCM) += vcm.o
+obj-$(CONFIG_VCM_SAMP) += vcm-sample.o
diff --git a/mm/vcm-sample.c b/mm/vcm-sample.c
new file mode 100644
index 0000000..e265a73
--- /dev/null
+++ b/mm/vcm-sample.c
@@ -0,0 +1,120 @@
+/*
+ * Virtual Contiguous Memory driver template
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * This is just sample code.  It does nothing useful other than
+ * presenting a template for a VCM driver.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#include <linux/vcm-drv.h>
+#include <linux/vcm-sample.h>
+#include <linux/module.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+
+struct vcm_samp {
+	struct vcm_mmu	mmu;
+	/* ... */
+};
+
+static const unsigned vcm_samp_orders[] = {
+	4 + 20 - PAGE_SHIFT,	/* 16MiB pages */
+	0 + 20 - PAGE_SHIFT,	/*  1MiB pages */
+	6 + 10 - PAGE_SHIFT,	/* 64KiB pages */
+	2 + 10 - PAGE_SHIFT,	/*  4KiB pages */
+};
+
+static int vcm_samp_activate_page(dma_addr_t vaddr, dma_addr_t paddr,
+				  unsigned order, void *priv)
+{
+	struct vcm_samp *samp =
+		container_of((struct vcm *)priv, struct vcm_samp, mmu.vcm);
+
+	/*
+	 * Handle adding a mapping from virtual page at @vaddr to
+	 * physical page at @paddr.  The page is of order @order which
+	 * means that it's (PAGE_SIZE << @order) bytes.
+	 */
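+	/*
+	 * A real driver would typically program its page tables or
+	 * IOMMU registers here, e.g. (hypothetical helper, not part
+	 * of this template):
+	 *
+	 *	return samp_iommu_map(samp, vaddr, paddr,
+	 *			      (resource_size_t)PAGE_SIZE << order);
+	 */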
+
+	return -EOPNOTSUPP;
+}
+
+static int vcm_samp_deactivate_page(dma_addr_t vaddr, dma_addr_t paddr,
+				    unsigned order, void *priv)
+{
+	struct vcm_samp *samp =
+		container_of((struct vcm *)priv, struct vcm_samp, mmu.vcm);
+
+	/*
+	 * Handle removing a mapping from virtual page at @vaddr to
+	 * physical page at @paddr.  The page is of order @order which
+	 * means that it's (PAGE_SIZE << @order) bytes.
+	 */
+
+	/* It's best not to fail here */
+	return 0;
+}
+
+static void vcm_samp_cleanup(struct vcm *vcm)
+{
+	struct vcm_samp *samp =
+		container_of(vcm, struct vcm_samp, mmu.vcm);
+
+	/* Clean ups ... */
+
+	kfree(samp);
+}
+
+struct vcm *__must_check vcm_samp_create(/* ... */)
+{
+	static const struct vcm_mmu_driver driver = {
+		.order           = vcm_samp_orders,
+		.cleanup         = vcm_samp_cleanup,
+		.activate_page   = vcm_samp_activate_page,
+		.deactivate_page = vcm_samp_deactivate_page,
+	};
+
+	struct vcm_samp *samp;
+	struct vcm *vcm;
+
+	switch (0) {
+	case 0:
+	case PAGE_SHIFT == 12:
+		/*
+		 * If you have a compilation error here it means you
+		 * are compiling for a very strange platform where
+		 * PAGE_SHIFT is not 12 (i.e. PAGE_SIZE is not 4KiB).
+		 * This driver assumes PAGE_SHIFT is 12.
+		 */
+	};
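+	/* (The check above is equivalent to BUILD_BUG_ON(PAGE_SHIFT != 12).) */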
+
+	samp = kzalloc(sizeof *samp, GFP_KERNEL);
+	if (!samp)
+		return ERR_PTR(-ENOMEM);
+
+	/* ... Set things up ... */
+
+	samp->mmu.driver    = &driver;
+	/* skip first 64K so that zero address will be a NULL pointer */
+	samp->mmu.vcm.start =  (64 << 10);
+	samp->mmu.vcm.size  = -(64 << 10);
+
+	vcm = vcm_mmu_init(&samp->mmu);
+	if (!IS_ERR(vcm))
+		return vcm;
+
+	/* ... Error recovery ... */
+
+	kfree(samp);
+	return vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_samp_create);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFCv5 9/9] mm: vcm: vcm-cma: VCM CMA driver added
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06  6:33   ` Michal Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michal Nazarewicz @ 2010-09-06  6:33 UTC (permalink / raw)
  To: linux-arm-kernel, linux-media, linux-mm
  Cc: Andrew Morton, Daniel Walker, FUJITA Tomonori, Hans Verkuil,
	Jonathan Corbet, KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk,
	Kyungmin Park, Marek Szyprowski, Mel Gorman, Minchan Kim,
	Pawel Osciak, Peter Zijlstra, Russell King, Zach Pfeffer,
	linux-kernel

This commit adds a VCM driver that, instead of using a real
hardware MMU, emulates one and uses CMA for allocating
contiguous memory chunks.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 Documentation/virtual-contiguous-memory.txt |   12 ++++-
 include/linux/vcm-cma.h                     |   38 ++++++++++++
 mm/Kconfig                                  |   14 +++++
 mm/Makefile                                 |    1 +
 mm/vcm-cma.c                                |   84 +++++++++++++++++++++++++++
 5 files changed, 148 insertions(+), 1 deletions(-)
 create mode 100644 include/linux/vcm-cma.h
 create mode 100644 mm/vcm-cma.c

diff --git a/Documentation/virtual-contiguous-memory.txt b/Documentation/virtual-contiguous-memory.txt
index 6d1014c..01e2e6c 100644
--- a/Documentation/virtual-contiguous-memory.txt
+++ b/Documentation/virtual-contiguous-memory.txt
@@ -496,7 +496,17 @@ able to run with One-to-One driver you should limit operations to:
 
 under some conditions, vcm_map() may also work.
 
-There are no One-to-One drivers at this time.
+*** VCM CMA
+
+The VCM CMA driver is a One-to-One driver which uses CMA (see
+[[file:contiguous-memory.txt][contiguous-memory.txt]]) to allocate physically contiguous memory.  A VCM
+CMA context is created by calling:
+
+	struct vcm *__must_check
+	vcm_cma_create(const char *regions, dma_addr_t alignment);
+
+Its first argument is the list of regions that CMA should try to
+allocate memory from.  The second argument is the required alignment.
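+
+For illustration only (the region name below is just an example), such
+a context could be set up like this:
+
+	struct vcm *vcm = vcm_cma_create("camera", 0);
+
+	if (IS_ERR(vcm))
+		return PTR_ERR(vcm);
+
+Allocation and mapping should then be done with vcm_make_binding()
+rather than with separate vcm_alloc() and vcm_bind() calls.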
 
 * Writing a VCM driver
 
diff --git a/include/linux/vcm-cma.h b/include/linux/vcm-cma.h
new file mode 100644
index 0000000..bc06767
--- /dev/null
+++ b/include/linux/vcm-cma.h
@@ -0,0 +1,38 @@
+/*
+ * Virtual Contiguous Memory driver for CMA header
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#ifndef __LINUX_VCM_CMA_H
+#define __LINUX_VCM_CMA_H
+
+#include <linux/types.h>
+
+struct vcm;
+
+/**
+ * vcm_cma_create() - creates a VCM context that fakes a hardware MMU
+ * @regions:	list of CMA regions physical allocations should be done
+ *		from.
+ * @alignment:	required alignment of allocations.
+ *
+ * This creates a VCM context that can be used on platforms with no
+ * hardware MMU or for devices that are connected to the bus directly.
+ * Because it does not represent a real MMU it has some limitations:
+ * basically, vcm_alloc(), vcm_reserve() and vcm_bind() are likely to
+ * fail so vcm_make_binding() should be used instead.
+ */
+struct vcm *__must_check
+vcm_cma_create(const char *regions, dma_addr_t alignment);
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index be040e7..bf0c7f6 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -410,3 +410,17 @@ config VCM_SAMP
 	  This enables a sample driver for the VCM framework.  This driver
 	  does not handle any real hardware.  It's merely a template
 	  for real drivers.
+
+config VCM_CMA
+	bool "VCM CMA driver"
+	depends on VCM && CMA
+	select VCM_O2O
+	help
+	  This enables a VCM driver that, instead of using a real hardware
+	  MMU, fakes one and uses a direct mapping.  It provides a subset
+	  of the functionality of a real MMU, but if drivers limit their
+	  use of VCM to the supported operations they can work on
+	  systems both with and without an MMU with no changes.
+
+	  For more information see
+	  <Documentation/virtual-contiguous-memory.txt>.
diff --git a/mm/Makefile b/mm/Makefile
index c465dfa..e376eef 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -51,3 +51,4 @@ obj-$(CONFIG_CMA) += cma.o
 obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
 obj-$(CONFIG_VCM) += vcm.o
 obj-$(CONFIG_VCM_SAMP) += vcm-sample.o
+obj-$(CONFIG_VCM_CMA) += vcm-cma.o
diff --git a/mm/vcm-cma.c b/mm/vcm-cma.c
new file mode 100644
index 0000000..177041a
--- /dev/null
+++ b/mm/vcm-cma.c
@@ -0,0 +1,84 @@
+/*
+ * Virtual Contiguous Memory driver for CMA
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/virtual-contiguous-memory.txt for details.
+ */
+
+#include <linux/vcm-cma.h>
+#include <linux/vcm-drv.h>
+#include <linux/cma.h>
+#include <linux/module.h>
+#include <linux/err.h>
+#include <linux/slab.h>
+
+struct vcm_cma {
+	struct vcm_o2o	o2o;
+	const char	*regions;
+	dma_addr_t	alignment;
+};
+
+static void *
+vcm_cma_alloc(struct vcm *vcm, struct vcm_phys_part *part, unsigned flags)
+{
+	struct vcm_cma *cma = container_of(vcm, struct vcm_cma, o2o.vcm);
+	dma_addr_t addr;
+
+	addr = cma_alloc_from(cma->regions, part->size, cma->alignment);
+	if (IS_ERR_VALUE(addr))
+		return ERR_PTR(addr);
+
+	part->start = addr;
+	return NULL;
+}
+
+static void vcm_cma_free(struct vcm_phys_part *part, void *priv)
+{
+	cma_free(part->start);
+}
+
+struct vcm *__must_check
+vcm_cma_create(const char *regions, dma_addr_t alignment)
+{
+	static const struct vcm_o2o_driver driver = {
+		.alloc	= vcm_cma_alloc,
+		.free	= vcm_cma_free,
+	};
+
+	struct cma_info info;
+	struct vcm_cma *cma;
+	struct vcm *vcm;
+	int ret;
+
+	if (alignment & (alignment - 1))
+		return ERR_PTR(-EINVAL);
+
+	ret = cma_info_about(&info, regions);
+	if (ret < 0)
+		return ERR_PTR(ret);
+	if (info.count == 0)
+		return ERR_PTR(-ENOENT);
+
+	cma = kmalloc(sizeof *cma, GFP_KERNEL);
+	if (!cma)
+		return ERR_PTR(-ENOMEM);
+
+	cma->o2o.driver    = &driver;
+	cma->o2o.vcm.start = info.lower_bound;
+	cma->o2o.vcm.size  = info.upper_bound - info.lower_bound;
+	cma->regions       = regions;
+	cma->alignment     = alignment;
+	vcm = vcm_o2o_init(&cma->o2o);
+	if (IS_ERR(vcm))
+		kfree(cma);
+	return vcm;
+}
+EXPORT_SYMBOL_GPL(vcm_cma_create);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [RFCv5 3/9] mm: cma: Added SysFS support
  2010-09-06  6:33   ` Michal Nazarewicz
  (?)
@ 2010-09-06 21:07     ` Greg KH
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2010-09-06 21:07 UTC (permalink / raw)
  To: Michal Nazarewicz
  Cc: linux-arm-kernel, linux-media, linux-mm, Andrew Morton,
	Daniel Walker, FUJITA Tomonori, Hans Verkuil, Jonathan Corbet,
	KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk, Kyungmin Park,
	Marek Szyprowski, Mel Gorman, Minchan Kim, Pawel Osciak,
	Peter Zijlstra, Russell King, Zach Pfeffer, linux-kernel

On Mon, Sep 06, 2010 at 08:33:53AM +0200, Michal Nazarewicz wrote:
> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> @@ -0,0 +1,53 @@
> +What:		/sys/kernel/mm/contiguous/
> +Date:		August 2010
> +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> +Description:
> +		If CMA has been built with SysFS support,
> +		/sys/kernel/mm/contiguous/ contains a file called
> +		"map", a file called "allocators" and a directory
> +		called "regions".
> +
> +		The "map" file lets one change the CMA's map attribute
> +		at run-time.

How?

> +		The "allocators" file list all registered allocators.
> +		Allocators with no name are listed as a single minus
> +		sign.

So this returns more than one value?

> +		The "regions" directory list all reserved regions.

Same here?

> --- a/Documentation/contiguous-memory.txt
> +++ b/Documentation/contiguous-memory.txt
> @@ -256,6 +256,10 @@
>       iff it matched in previous pattern.  If the second part is
>       omitted it will mach any type of memory requested by device.
>  
> +     If SysFS support is enabled, this attribute is accessible via
> +     SysFS and can be changed at run-time by writing to
> +     /sys/kernel/mm/contiguous/map.

it is "sysfs", not weird "SysFS", please.

> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index f6f9cb5..d0f41f4 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -18,6 +18,9 @@
>  #include <linux/rbtree.h>
>  #include <linux/list.h>
>  #include <linux/init.h>
> +#if defined CONFIG_CMA_SYSFS
> +#  include <linux/kobject.h>
> +#endif

Just always include it, no need to put a #ifdef around it, right?

>  
>  struct device;
> @@ -204,6 +207,10 @@ struct cma_region {
>  	unsigned users;
>  	struct list_head list;
>  
> +#if defined CONFIG_CMA_SYSFS
> +	struct kobject kobj;
> +#endif

Same here, no need for a #ifdef, right?

Especially as it will change your lifetime rules for this object, why
would you not want this to always be the case?

> +
>  	unsigned used:1;
>  	unsigned registered:1;
>  	unsigned reserved:1;
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 86043a3..8bed799 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -322,16 +322,36 @@ config CMA
>  	  For more information see <Documentation/contiguous-memory.txt>.
>  	  If unsure, say "n".
>  
> -config CMA_DEBUG
> -	bool "CMA debug messages (DEVELOPEMENT)"
> +config CMA_DEVELOPEMENT
> +	bool "Include CMA developement features"
>  	depends on CMA
>  	help
> +	  This lets you enable some developement features of the CMA
> +	  framework.  It does not add any code to the kernel.
> +
> +	  Those options are mostly usable during development and testing.
> +	  If unsure, say "n".
> +
> +config CMA_DEBUG
> +	bool "CMA debug messages"
> +	depends on CMA_DEVELOPEMENT
> +	help
>  	  Turns on debug messages in CMA.  This produces KERN_DEBUG
>  	  messages for every CMA call as well as various messages while
>  	  processing calls such as cma_alloc().  This option does not
>  	  affect warning and error messages.
>  
> -	  This is mostly used during development.  If unsure, say "n".
> +config CMA_SYSFS
> +	bool "CMA SysFS interface support"
> +	depends on CMA_DEVELOPEMENT
> +	help
> +	  Enable support for SysFS interface.  The interface is available
> +	  under /sys/kernel/mm/contiguous.  Each region and allocator is
> +	  represented there.
> +
> +	  For more information consult
> +	  <Documentation/contiguous-memory.txt> and
> +	  <Documentation/ABI/testing/sysfs-kernel-mm-contiguous> files.
>  
>  config CMA_BEST_FIT
>  	bool "CMA best-fit allocator"
> diff --git a/mm/cma.c b/mm/cma.c
> index 06d0d5a..955f08c 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -38,8 +38,8 @@
>  
>  
>  /*
> - * Protects cma_regions, cma_allocators, cma_map, cma_map_length, and
> - * cma_chunks_by_start.
> + * Protects cma_regions, cma_allocators, cma_map, cma_map_length,
> + * cma_kobj, cma_sysfs_regions and cma_chunks_by_start.
>   */
>  static DEFINE_MUTEX(cma_mutex);
>  
> @@ -143,7 +143,11 @@ int __init __must_check cma_early_region_register(struct cma_region *reg)
>  
>  /************************* Regions & Allocators *************************/
>  
> +static void __cma_sysfs_region_add(struct cma_region *reg);
> +
>  static int __cma_region_attach_alloc(struct cma_region *reg);
> +static void __maybe_unused __cma_region_detach_alloc(struct cma_region *reg);
> +
>  
>  /* List of all regions.  Named regions are kept before unnamed. */
>  static LIST_HEAD(cma_regions);
> @@ -226,6 +230,8 @@ int __must_check cma_region_register(struct cma_region *reg)
>  	else
>  		list_add_tail(&reg->list, &cma_regions);
>  
> +	__cma_sysfs_region_add(reg);
> +
>  done:
>  	mutex_unlock(&cma_mutex);
>  
> @@ -482,6 +488,329 @@ subsys_initcall(cma_init);
>  
>  
>  
> +/************************* SysFS *************************/
> +
> +#if defined CONFIG_CMA_SYSFS
> +
> +static struct kobject cma_sysfs_regions;

No static kobjects please, that means the code is broken.

> +static int cma_sysfs_regions_ready;
> +
> +
> +#define CMA_ATTR_INLINE(_type, _name)					\
> +	(&((struct cma_ ## _type ## _attribute){			\
> +		.attr	= {						\
> +			.name	= __stringify(_name),			\
> +			.mode	= 0644,					\
> +		},							\
> +		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
> +		.store	= cma_sysfs_ ## _type ## _ ## _name ## _store,	\
> +	}).attr)
> +
> +#define CMA_ATTR_RO_INLINE(_type, _name)				\
> +	(&((struct cma_ ## _type ## _attribute){			\
> +		.attr	= {						\
> +			.name	= __stringify(_name),			\
> +			.mode	= 0444,					\
> +		},							\
> +		.show	= cma_sysfs_ ## _type ## _ ## _name ## _show,	\
> +	}).attr)
> +
> +
> +struct cma_root_attribute {
> +	struct attribute attr;
> +	ssize_t (*show)(char *buf);
> +	int (*store)(const char *buf);
> +};

Why can't you use a 'struct device' instead of using a "raw" kobject?
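
For what it's worth, a rough sketch of that alternative (purely
illustrative and incomplete -- it only shows a single read-only
attribute, and the device itself would still have to be registered):

	static ssize_t map_show(struct device *dev,
				struct device_attribute *attr, char *buf)
	{
		return snprintf(buf, PAGE_SIZE, "%s\n", cma_map ?: "");
	}
	static DEVICE_ATTR(map, 0444, map_show, NULL);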

> +
> +static ssize_t cma_sysfs_root_map_show(char *page)
> +{
> +	ssize_t len;
> +
> +	len = cma_map_length;
> +	if (!len) {
> +		*page = 0;
> +		len = 0;
> +	} else {
> +		if (len > (size_t)PAGE_SIZE - 1)
> +			len = (size_t)PAGE_SIZE - 1;
> +		memcpy(page, cma_map, len);
> +		page[len++] = '\n';
> +	}
> +
> +	return len;
> +}

What are you returning here?  A "raw" buffer?  Why?  That's not what
sysfs is for.

> +
> +static int cma_sysfs_root_map_store(const char *page)
> +{
> +	ssize_t len = cma_map_validate(page);
> +	char *val = NULL;
> +
> +	if (len < 0)
> +		return len;
> +
> +	if (len) {
> +		val = kmemdup(page, len + 1, GFP_KERNEL);
> +		if (!val)
> +			return -ENOMEM;
> +		val[len] = '\0';
> +	}
> +
> +	kfree(cma_map);
> +	cma_map = val;
> +	cma_map_length = len;
> +
> +	return 0;
> +}

What is this doing?  It doesn't look like a "text" file to me, or am I
confused?

> +
> +static ssize_t cma_sysfs_root_allocators_show(char *page)
> +{
> +	struct cma_allocator *alloc;
> +	size_t left = PAGE_SIZE;
> +	char *ch = page;
> +
> +	cma_foreach_allocator(alloc) {
> +		ssize_t l = snprintf(ch, left, "%s ", alloc->name ?: "-");
> +		ch   += l;
> +		left -= l;
> +	}
> +
> +	if (ch != page)
> +		ch[-1] = '\n';
> +	return ch - page;
> +}

You are returning more than one value per sysfs file, why?

> +static ssize_t
> +cma_sysfs_root_show(struct kobject *kobj, struct attribute *attr, char *buf)
> +{
> +	struct cma_root_attribute *rattr =
> +		container_of(attr, struct cma_root_attribute, attr);
> +	ssize_t ret;
> +
> +	mutex_lock(&cma_mutex);
> +	ret = rattr->show(buf);

Are you sure you are not going to want to pass the attribute and kobject
to the show/store function?

> +	mutex_unlock(&cma_mutex);
> +
> +	return ret;
> +}
> +
> +static ssize_t
> +cma_sysfs_root_store(struct kobject *kobj, struct attribute *attr,
> +		       const char *buf, size_t count)
> +{
> +	struct cma_root_attribute *rattr =
> +		container_of(attr, struct cma_root_attribute, attr);
> +	int ret;
> +
> +	mutex_lock(&cma_mutex);
> +	ret = rattr->store(buf);
> +	mutex_unlock(&cma_mutex);
> +
> +	return ret < 0 ? ret : count;
> +}
> +
> +static struct kobj_type cma_sysfs_root_type = {
> +	.sysfs_ops	= &(const struct sysfs_ops){
> +		.show	= cma_sysfs_root_show,
> +		.store	= cma_sysfs_root_store,
> +	},
> +	.default_attrs	= (struct attribute * []) {
> +		CMA_ATTR_INLINE(root, map),
> +		CMA_ATTR_RO_INLINE(root, allocators),
> +		NULL
> +	},
> +};

This type is broken by design, and as per the documentation in the
kernel, I now get to publicly ridicule you for doing so (as explained
in the documentation).

Please go fix it up.

If you don't know why it is broken, then please, read the documentation
provided in the kernel, it's as if I write it for no one to later pay
attention to.  Why do I even bother...
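
For reference, the rule being invoked here (Documentation/kobject.txt)
is that a kobj_type must provide a release() callback which frees the
object embedding the kobject once its last reference is dropped.  A
generic sketch, with made-up names:

	struct foo {
		struct kobject kobj;
		/* ... */
	};

	static void foo_release(struct kobject *kobj)
	{
		kfree(container_of(kobj, struct foo, kobj));
	}

	static struct kobj_type foo_ktype = {
		.release	= foo_release,
		.sysfs_ops	= &foo_sysfs_ops,
		.default_attrs	= foo_default_attrs,
	};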

> +
> +static int __init cma_sysfs_init(void)
> +{
> +	static struct kobject root;

Again, no static kobjects please.

> +	static struct kobj_type fake_type;
> +
> +	struct cma_region *reg;
> +	int ret;
> +
> +	/* Root */
> +	ret = kobject_init_and_add(&root, &cma_sysfs_root_type,
> +				   mm_kobj, "contiguous");
> +	if (unlikely(ret < 0)) {

Why "unlikely"?  Is this a fast-path or something?  Hint, I don't think
so...

> +		pr_err("init: unable to add root kobject: %d\n", ret);
> +		return ret;
> +	}
> +
> +	/* Regions */
> +	ret = kobject_init_and_add(&cma_sysfs_regions, &fake_type,
> +				   &root, "regions");
> +	if (unlikely(ret < 0)) {

Same thing here.

> +		pr_err("init: unable to add regions kobject: %d\n", ret);
> +		return ret;
> +	}
> +
> +	mutex_lock(&cma_mutex);
> +	cma_sysfs_regions_ready = 1;
> +	cma_foreach_region(reg)
> +		__cma_sysfs_region_add(reg);
> +	mutex_unlock(&cma_mutex);
> +
> +	return 0;
> +}
> +device_initcall(cma_sysfs_init);
> +
> +
> +
> +struct cma_region_attribute {
> +	struct attribute attr;
> +	ssize_t (*show)(struct cma_region *reg, char *buf);
> +	int (*store)(struct cma_region *reg, const char *buf);
> +};
> +
> +
> +static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
> +{
> +	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
> +}

Is a name field ever really going to be bigger than a page?

> +
> +static ssize_t cma_sysfs_region_start_show(struct cma_region *reg, char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%p\n", (void *)reg->start);
> +}

Same for a start field.

> +
> +static ssize_t cma_sysfs_region_size_show(struct cma_region *reg, char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%zu\n", reg->size);
> +}

Same for the size

> +
> +static ssize_t cma_sysfs_region_free_show(struct cma_region *reg, char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%zu\n", reg->free_space);
> +}

Same here.

> +
> +static ssize_t cma_sysfs_region_users_show(struct cma_region *reg, char *page)
> +{
> +	return snprintf(page, PAGE_SIZE, "%u\n", reg->users);
> +}

And here

> +
> +static ssize_t cma_sysfs_region_alloc_show(struct cma_region *reg, char *page)
> +{
> +	if (reg->alloc)
> +		return snprintf(page, PAGE_SIZE, "%s\n",
> +				reg->alloc->name ?: "-");
> +	else if (reg->alloc_name)
> +		return snprintf(page, PAGE_SIZE, "[%s]\n", reg->alloc_name);
> +	else
> +		return 0;
> +}

Are all of these files in the Documentation/ABI directory?  I missed
that in your above description...


> +
> +static int
> +cma_sysfs_region_alloc_store(struct cma_region *reg, const char *page)
> +{
> +	char *s;
> +
> +	if (reg->alloc && reg->users)
> +		return -EBUSY;
> +
> +	if (!*page || *page == '\n') {
> +		s = NULL;
> +	} else {
> +		size_t len;
> +
> +		for (s = (char *)page; *++s && *s != '\n'; )
> +			/* nop */;
> +
> +		len = s - page;
> +		s = kmemdup(page, len + 1, GFP_KERNEL);
> +		if (!s)
> +			return -ENOMEM;
> +		s[len] = '\0';
> +	}
> +
> +	if (reg->alloc)
> +		__cma_region_detach_alloc(reg);
> +
> +	if (reg->free_alloc_name)
> +		kfree(reg->alloc_name);
> +
> +	reg->alloc_name = s;
> +	reg->free_alloc_name = !!s;
> +
> +	return 0;
> +}
> +
> +
> +static ssize_t
> +cma_sysfs_region_show(struct kobject *kobj, struct attribute *attr,
> +		      char *buf)
> +{
> +	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
> +	struct cma_region_attribute *rattr =
> +		container_of(attr, struct cma_region_attribute, attr);
> +	ssize_t ret;
> +
> +	mutex_lock(&cma_mutex);
> +	ret = rattr->show(reg, buf);
> +	mutex_unlock(&cma_mutex);
> +
> +	return ret;
> +}
> +
> +static int
> +cma_sysfs_region_store(struct kobject *kobj, struct attribute *attr,
> +		       const char *buf, size_t count)
> +{
> +	struct cma_region *reg = container_of(kobj, struct cma_region, kobj);
> +	struct cma_region_attribute *rattr =
> +		container_of(attr, struct cma_region_attribute, attr);
> +	int ret;
> +
> +	mutex_lock(&cma_mutex);
> +	ret = rattr->store(reg, buf);
> +	mutex_unlock(&cma_mutex);
> +
> +	return ret < 0 ? ret : count;
> +}
> +
> +static struct kobj_type cma_sysfs_region_type = {
> +	.sysfs_ops	= &(const struct sysfs_ops){
> +		.show	= cma_sysfs_region_show,
> +		.store	= cma_sysfs_region_store,
> +	},
> +	.default_attrs	= (struct attribute * []) {
> +		CMA_ATTR_RO_INLINE(region, name),
> +		CMA_ATTR_RO_INLINE(region, start),
> +		CMA_ATTR_RO_INLINE(region, size),
> +		CMA_ATTR_RO_INLINE(region, free),
> +		CMA_ATTR_RO_INLINE(region, users),
> +		CMA_ATTR_INLINE(region, alloc),
> +		NULL
> +	},
> +};

Again, broken by design, please fix.
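
Hint: a kobj_type with no ->release() method is exactly what the
kobject documentation calls broken, because nothing can ever free the
object once the last reference is dropped.  If the regions were
dynamically allocated (they are not in this patch, which is part of the
problem), the missing piece would look roughly like this untested
sketch:

	static void cma_sysfs_region_release(struct kobject *kobj)
	{
		struct cma_region *reg =
			container_of(kobj, struct cma_region, kobj);

		/* free whatever backs the region once the last
		 * reference goes away */
		kfree(reg);
	}

wired up via .release = cma_sysfs_region_release in the kobj_type
above.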

> +
> +static void __cma_sysfs_region_add(struct cma_region *reg)
> +{
> +	int ret;
> +
> +	if (!cma_sysfs_regions_ready)
> +		return;
> +
> +	memset(&reg->kobj, 0, sizeof reg->kobj);
> +
> +	ret = kobject_init_and_add(&reg->kobj, &cma_sysfs_region_type,
> +				   &cma_sysfs_regions,
> +				   "%p", (void *)reg->start);
> +
> +	if (reg->name &&
> +	    sysfs_create_link(&cma_sysfs_regions, &reg->kobj, reg->name) < 0)
> +		/* Ignore any errors. */;

That's handy.  Why do you think the kernel is FORCING you to check the
return value of this function?  So you can then just ignore it?  It's as
if we add this code for absolutely no reason; sometimes I wonder why I
even bother...
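
At the very least something along these lines would be expected (an
untested sketch, just to illustrate the point):

	ret = kobject_init_and_add(&reg->kobj, &cma_sysfs_region_type,
				   &cma_sysfs_regions,
				   "%p", (void *)reg->start);
	if (ret) {
		pr_err("%s: unable to add region kobject: %d\n",
		       __func__, ret);
		return;
	}

	if (reg->name) {
		ret = sysfs_create_link(&cma_sysfs_regions, &reg->kobj,
					reg->name);
		if (ret)
			pr_err("%s: unable to create \"%s\" link: %d\n",
			       __func__, reg->name, ret);
	}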

In short, ick, ick, ick.

Oh, and go read the documentation, it's there for a reason.

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 0/9] CMA + VCMM integration
  2010-09-06  6:33 ` Michal Nazarewicz
  (?)
@ 2010-09-06 21:09   ` Greg KH
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2010-09-06 21:09 UTC (permalink / raw)
  To: Michal Nazarewicz
  Cc: linux-arm-kernel, linux-media, linux-mm, Andrew Morton,
	Daniel Walker, FUJITA Tomonori, Hans Verkuil, Jonathan Corbet,
	KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk, Kyungmin Park,
	Marek Szyprowski, Mel Gorman, Minchan Kim, Pawel Osciak,
	Peter Zijlstra, Russell King, Zach Pfeffer, linux-kernel

On Mon, Sep 06, 2010 at 08:33:50AM +0200, Michal Nazarewicz wrote:
> Hello everyone,
> 
> This patchset introduces a draft of a redesign of Zach Pfeffer's
> VCMM.

What is a VCMM?

What is a CMA?

> Not all of the functionality of the original VCMM has been
> ported into this patchset.  This is mostly meant as RFC.  Moreover,
> the code for VCMM implementation in this RFC has not been tested.

If you haven't even tested it, why should we review it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 8/9] mm: vcm: Sample driver added
  2010-09-06  6:33   ` Michal Nazarewicz
  (?)
@ 2010-09-06 21:10     ` Greg KH
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2010-09-06 21:10 UTC (permalink / raw)
  To: Michal Nazarewicz
  Cc: linux-arm-kernel, linux-media, linux-mm, Andrew Morton,
	Daniel Walker, FUJITA Tomonori, Hans Verkuil, Jonathan Corbet,
	KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk, Kyungmin Park,
	Marek Szyprowski, Mel Gorman, Minchan Kim, Pawel Osciak,
	Peter Zijlstra, Russell King, Zach Pfeffer, linux-kernel

On Mon, Sep 06, 2010 at 08:33:58AM +0200, Michal Nazarewicz wrote:
> --- /dev/null
> +++ b/include/linux/vcm-sample.h

Don't put "sample" code in include/linux/ please.  That's just
cluttering up the place, don't you think?  Especially as no one else
needs the file there...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 0/9] CMA + VCMM integration
  2010-09-06 21:09   ` Greg KH
  (?)
@ 2010-09-07  1:40     ` Michał Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michał Nazarewicz @ 2010-09-07  1:40 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-arm-kernel, linux-media, linux-mm, Andrew Morton,
	Daniel Walker, FUJITA Tomonori, Hans Verkuil, Jonathan Corbet,
	KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk, Kyungmin Park,
	Marek Szyprowski, Mel Gorman, Minchan Kim, Pawel Osciak,
	Peter Zijlstra, Russell King, Zach Pfeffer, linux-kernel

On Mon, 06 Sep 2010 23:09:05 +0200, Greg KH <greg@kroah.com> wrote:

> On Mon, Sep 06, 2010 at 08:33:50AM +0200, Michal Nazarewicz wrote:
>> Hello everyone,
>>
>> This patchset introduces a draft of a redesign of Zach Pfeffer's
>> VCMM.
>
> What is a VCMM?

Virtual Contiguous Memory Manager.  The version posted by Zach can
be found at: <http://article.gmane.org/gmane.linux.kernel.mm/50090>.
It is an API for managing IO MMUs and IO MMU mappings.

> What is a CMA?

Contiguous Memory Allocator.  The v4 version can be found at
<http://marc.info/?l=linux-mm&m=128229799415817&w=2>.  It is an API for
allocating large, physically contiguous blocks of memory.
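Roughly speaking, a driver asks for a chunk of a given size and gets
back the address of a physically contiguous block that it can hand
straight to the hardware.  Purely as an illustration -- the names below
are made up for this sketch, the linked posting has the actual API:

	/* illustration only; not the real CMA function names */
	phys_addr_t chunk = contiguous_alloc(dev, 4 << 20, PAGE_SIZE);

	if (chunk)
		program_device_dma(dev, chunk);	/* hypothetical consumer */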

I hadn't expected that anyone who hasn't already participated in the
discussion about CMA and VCMM would be interested in this patchset,
so I was a bit vague in the cover letter.  Sorry about that.

>> Not all of the functionality of the original VCMM has been
>> ported into this patchset.  This is mostly meant as RFC.  Moreover,
>> the code for VCMM implementation in this RFC has not been tested.

> If you haven't even tested it, why should we review it?

Please ignore the code then and look just at the documentation.
I wanted to post what I have in order to receive comments about the
general idea, not necessarily the code itself.  The code is just a means
of showing how I see the implementation of the idea described in the
documentation.  Because of all that, I marked the patchset as an RFC
rather than a PATCH.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 8/9] mm: vcm: Sample driver added
  2010-09-06 21:10     ` Greg KH
  (?)
@ 2010-09-07  1:58       ` Michał Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michał Nazarewicz @ 2010-09-07  1:58 UTC (permalink / raw)
  To: Greg KH
  Cc: Hans Verkuil, Peter Zijlstra, Daniel Walker, Russell King,
	Jonathan Corbet, Mel Gorman, Pawel Osciak, Konrad Rzeszutek Wilk,
	linux-kernel, FUJITA Tomonori, linux-mm, Kyungmin Park,
	Minchan Kim, Zach Pfeffer, KAMEZAWA Hiroyuki, Andrew Morton,
	Marek Szyprowski, linux-arm-kernel, linux-media

On Mon, 06 Sep 2010 23:10:54 +0200, Greg KH <greg@kroah.com> wrote:

> On Mon, Sep 06, 2010 at 08:33:58AM +0200, Michal Nazarewicz wrote:
>> --- /dev/null
>> +++ b/include/linux/vcm-sample.h
>
> Don't put "sample" code in include/linux/ please.  That's just
> cluttering up the place, don't you think?  Especially as no one else
> needs the file there...

Absolutely true.  My plan is to replace the sample driver with a real
driver and post it with v6.  For now I just wanted to include a piece of
code that looks like a driver, for presentation purposes.  Sorry for the
confusion.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 0/9] CMA + VCMM integration
  2010-09-07  1:40     ` Michał Nazarewicz
  (?)
@ 2010-09-07  2:34       ` Greg KH
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2010-09-07  2:34 UTC (permalink / raw)
  To: Michał Nazarewicz
  Cc: linux-arm-kernel, linux-media, linux-mm, Andrew Morton,
	Daniel Walker, FUJITA Tomonori, Hans Verkuil, Jonathan Corbet,
	KAMEZAWA Hiroyuki, Konrad Rzeszutek Wilk, Kyungmin Park,
	Marek Szyprowski, Mel Gorman, Minchan Kim, Pawel Osciak,
	Peter Zijlstra, Russell King, Zach Pfeffer, linux-kernel

On Tue, Sep 07, 2010 at 03:40:46AM +0200, Michał Nazarewicz wrote:
> On Mon, 06 Sep 2010 23:09:05 +0200, Greg KH <greg@kroah.com> wrote:
>
>> On Mon, Sep 06, 2010 at 08:33:50AM +0200, Michal Nazarewicz wrote:
>>> Hello everyone,
>>>
>>> This patchset introduces a draft of a redesign of Zach Pfeffer's
>>> VCMM.
>>
>> What is a VCMM?
>
> Virtual Contiguous Memory Manager.  The version posted by Zach can
> be found at: <http://article.gmane.org/gmane.linux.kernel.mm/50090>.
> It is an API for managing IO MMUs and IO MMU mappings.
>
>> What is a CMA?
>
> Contiguous Memory Allocator.  The v4 version can be found at
> <http://marc.info/?l=linux-mm&m=128229799415817&w=2>.  It is an API for
> allocating large, physically contiguous blocks of memory.
>
> I hadn't expected that anyone who hasn't already participated in the
> discussion about CMA and VCMM would be interested in this patchset,
> so I was a bit vague in the cover letter.  Sorry about that.
>
>>> Not all of the functionality of the original VCMM has been
>>> ported into this patchset.  This is mostly meant as RFC.  Moreover,
>>> the code for VCMM implementation in this RFC has not been tested.
>
>> If you haven't even tested it, why should we review it?
>
> Please ignore the code then and look just at the documentation.
> I wanted to post what I have in order to receive comments about the
> general idea, not necessarily the code itself.  The code is just a means
> of showing how I see the implementation of the idea described in the
> documentation.  Because of all that, I marked the patchset as an RFC
> rather than a PATCH.

Oops, I looked at the code, sorry :)

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 3/9] mm: cma: Added SysFS support
  2010-09-06 21:07     ` Greg KH
  (?)
@ 2010-09-07  5:31       ` Michał Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michał Nazarewicz @ 2010-09-07  5:31 UTC (permalink / raw)
  To: Greg KH
  Cc: Hans Verkuil, Peter Zijlstra, Daniel Walker, Russell King,
	Jonathan Corbet, Mel Gorman, Pawel Osciak, Konrad Rzeszutek Wilk,
	linux-kernel, FUJITA Tomonori, linux-mm, Kyungmin Park,
	Minchan Kim, Zach Pfeffer, KAMEZAWA Hiroyuki, Andrew Morton,
	Marek Szyprowski, linux-arm-kernel, linux-media

Hello Greg,

Thanks for reviewing the sysfs part.  Actually, I was never really sure
whether I shouldn't rather have put this code in debugfs, and you have
convinced me that I should.  Sysfs somehow looked more appealing from
the kernel's API point of view -- things seem more organised in sysfs
than in debugfs.  It seems I'll have to port it to debugfs after all.
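
Just to sketch the direction, here is a rough, untested illustration of
what the per-region entries could become under debugfs (the directory
names are made up for the example):

	#include <linux/debugfs.h>

	/* created at init time with debugfs_create_dir("cma", NULL) */
	static struct dentry *cma_debugfs_root;

	static void __cma_debugfs_region_add(struct cma_region *reg)
	{
		struct dentry *d;

		d = debugfs_create_dir(reg->name ?: "anonymous",
				       cma_debugfs_root);
		if (!d)
			return;

		debugfs_create_size_t("size", 0444, d, &reg->size);
		debugfs_create_size_t("free", 0444, d, &reg->free_space);
		debugfs_create_u32("users", 0444, d, &reg->users);
	}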

Nonetheless, a few responses to your comments:

> On Mon, Sep 06, 2010 at 08:33:53AM +0200, Michal Nazarewicz wrote:
>> +		The "allocators" file list all registered allocators.
>> +		Allocators with no name are listed as a single minus
>> +		sign.

On Mon, 06 Sep 2010 23:07:47 +0200, Greg KH <greg@kroah.com> wrote:
> So this returns more than one value?

Aren't things like cpufreq governors listed in a single sysfs file?
I remember there was such a file somewhere.  Has that been deprecated?
I cannot seem to find any information on that.

>> +		The "regions" directory list all reserved regions.
>
> Same here?

"regions" is actually a directory with a subdirectory for each
region. ;)
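
With the patch applied the layout looks roughly like this (using a
hypothetical region named "common" starting at 0x40000000; each
subdirectory is named after the region's start address):

	/sys/kernel/mm/contiguous/
	|-- map
	|-- allocators
	`-- regions/
	    |-- common -> 40000000	(symlink for named regions)
	    `-- 40000000/
		|-- name
		|-- start
		|-- size
		|-- free
		|-- users
		`-- alloc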

>> +static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
>> +{
>> +	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
>> +}

> Is a name field ever really going to be bigger than a page?

I prefer being on the safe side -- I have no idea what a user will
provide as the region name, so I assume as little as possible.  For the
numeric values you are right that snprintf() is a bit paranoid, but
I still see no good reason not to use it.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [RFCv5 3/9] mm: cma: Added SysFS support
@ 2010-09-07  5:31       ` Michał Nazarewicz
  0 siblings, 0 replies; 59+ messages in thread
From: Michał Nazarewicz @ 2010-09-07  5:31 UTC (permalink / raw)
  To: linux-arm-kernel

Hello Greg,

Thanks for reviewing the sysfs part.  Actually, I was never really sure
if I shouldn't rather put this code to debugfs and you got me convinced
that I should.  Sysfs somehow looked more appealing from kernel's API
point of view -- things seem to be more organised in sysfs than in
debugfs.  It seems I'll have to port it to debugfs after all

Nonetheless, a few responses to your comments:

> On Mon, Sep 06, 2010 at 08:33:53AM +0200, Michal Nazarewicz wrote:
>> +		The "allocators" file list all registered allocators.
>> +		Allocators with no name are listed as a single minus
>> +		sign.

On Mon, 06 Sep 2010 23:07:47 +0200, Greg KH <greg@kroah.com> wrote:
> So this returns more than one value?

Aren't thing like cpufreq governors listed in a single sysfs file?
I remember there was such a file somewhere.  Has that been made
deprecated? I cannot seem to find any information on that.

>> +		The "regions" directory list all reserved regions.
>
> Same here?

regions is actually a directory with subdirectories for each
region. ;)

>> +static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
>> +{
>> +	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
>> +}

> Is a name field ever really going to be bigger than a page?

I prefer being on the safe side -- I have no idea what user will provide
as region name so I assume as little as possible.  For numeric values you
are right that snprintf() is a bit paranoid, still I see no good reason
why not to use it.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Micha? "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 3/9] mm: cma: Added SysFS support
  2010-09-07  5:31       ` Michał Nazarewicz
  (?)
@ 2010-09-07  6:08         ` Greg KH
  -1 siblings, 0 replies; 59+ messages in thread
From: Greg KH @ 2010-09-07  6:08 UTC (permalink / raw)
  To: Michał Nazarewicz
  Cc: Hans Verkuil, Peter Zijlstra, Daniel Walker, Russell King,
	Jonathan Corbet, Mel Gorman, Pawel Osciak, Konrad Rzeszutek Wilk,
	linux-kernel, FUJITA Tomonori, linux-mm, Kyungmin Park,
	Minchan Kim, Zach Pfeffer, KAMEZAWA Hiroyuki, Andrew Morton,
	Marek Szyprowski, linux-arm-kernel, linux-media

On Tue, Sep 07, 2010 at 07:31:30AM +0200, Michał Nazarewicz wrote:
> Hello Greg,
>
> Thanks for reviewing the sysfs part.  Actually, I was never really sure
> whether I shouldn't rather put this code in debugfs, and you have convinced
> me that I should.  Sysfs somehow looked more appealing from the kernel API
> point of view -- things seem to be more organised in sysfs than in
> debugfs.  It seems I'll have to port it to debugfs after all.

Yes, debugfs looks like a much better place for this.  You can do
whatever you want in debugfs as long as you follow the one rule for it:
	There are no rules for debugfs.

> Nonetheless, a few responses to your comments:
>
>> On Mon, Sep 06, 2010 at 08:33:53AM +0200, Michal Nazarewicz wrote:
>>> +		The "allocators" file list all registered allocators.
>>> +		Allocators with no name are listed as a single minus
>>> +		sign.
>
> On Mon, 06 Sep 2010 23:07:47 +0200, Greg KH <greg@kroah.com> wrote:
>> So this returns more than one value?
>
> Aren't things like cpufreq governors listed in a single sysfs file?

Yeah, but I don't like it :)

> I remember there was such a file somewhere.  Has that been
> deprecated?  I cannot seem to find any information on that.

It's best if you really don't do this, but it does happen as it is the
best way to show the information.  If that's the case, fine.

>>> +		The "regions" directory list all reserved regions.
>>
>> Same here?
>
> regions is actually a directory with subdirectories for each
> region. ;)

Ah.

>>> +static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
>>> +{
>>> +	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
>>> +}
>
>> Is a name field ever really going to be bigger than a page?
>
> I prefer being on the safe side -- I have no idea what the user will
> provide as a region name, so I assume as little as possible.

By "user" do you mean userspace, or another kernel driver file?  If it's
a kernel driver, you can assume that they will be sane.  if userspace,
assume it's insane and do some checking of the name before you use it.

> For numeric values you are right that snprintf() is a bit paranoid,
> but I still see no good reason not to use it.

Same goes for no good reason to use it :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 3/9] mm: cma: Added SysFS support
  2010-09-07  6:08         ` Greg KH
  (?)
@ 2010-09-07  6:55           ` Michał Nazarewicz
  -1 siblings, 0 replies; 59+ messages in thread
From: Michał Nazarewicz @ 2010-09-07  6:55 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-arm-kernel, FUJITA Tomonori, Daniel Walker, Russell King,
	Jonathan Corbet, Peter Zijlstra, Pawel Osciak,
	Konrad Rzeszutek Wilk, linux-kernel, Hans Verkuil, Mel Gorman,
	Kyungmin Park, Zach Pfeffer, Minchan Kim, Andrew Morton,
	linux-media, linux-mm, KAMEZAWA Hiroyuki, Marek Szyprowski

> On Tue, Sep 07, 2010 at 07:31:30AM +0200, Michał Nazarewicz wrote:
>> Thanks for reviewing the sysfs part.  Actually, I was never really sure
>> whether I shouldn't rather put this code in debugfs, and you have convinced
>> me that I should.  Sysfs somehow looked more appealing from the kernel API
>> point of view -- things seem to be more organised in sysfs than in
>> debugfs.  It seems I'll have to port it to debugfs after all.

On Tue, 07 Sep 2010 08:08:18 +0200, Greg KH <greg@kroah.com> wrote:
> Yes, debugfs looks like a much better place for this.

I'll fix that in v6 then.
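
Just to sketch what I have in mind (untested, and all the cma_debugfs_*
names below are made up for illustration only), the per-region "name"
attribute could become a seq_file-backed debugfs file roughly like this:

#include <linux/debugfs.h>
#include <linux/seq_file.h>
#include <linux/err.h>
#include <linux/cma.h>		/* struct cma_region, from the CMA patch */

/* Illustration only -- nothing below is in the patchset yet; it just
 * shows the usual single_open()/seq_file pattern for a read-only file. */

static int cma_debugfs_region_name_show(struct seq_file *s, void *unused)
{
	struct cma_region *reg = s->private;

	if (reg->name)
		seq_printf(s, "%s\n", reg->name);
	return 0;
}

static int cma_debugfs_region_name_open(struct inode *inode, struct file *file)
{
	return single_open(file, cma_debugfs_region_name_show,
			   inode->i_private);
}

static const struct file_operations cma_debugfs_region_name_fops = {
	.open		= cma_debugfs_region_name_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= single_release,
};

/* called once per region while building the directory tree */
static void cma_debugfs_add_region(struct dentry *root, struct cma_region *reg)
{
	struct dentry *dir = debugfs_create_dir(reg->name ?: "-", root);

	if (!IS_ERR_OR_NULL(dir))
		debugfs_create_file("name", 0444, dir, reg,
				    &cma_debugfs_region_name_fops);
}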

>>>> +static ssize_t cma_sysfs_region_name_show(struct cma_region *reg, char *page)
>>>> +{
>>>> +	return reg->name ? snprintf(page, PAGE_SIZE, "%s\n", reg->name) : 0;
>>>> +}

>>> Is a name field ever really going to be bigger than a page?

>> For numeric values you are right that snprintf() is a bit paranoid,
>> but I still see no good reason not to use it.

> Same goes for no good reason to use it :)

I somehow prefer to always use the "safe" versions of the string
manipulation functions -- it's better to use them everywhere than to
forget in one place.  A call to sprintf() is translated to vsnprintf()
anyway, so there's no performance gain.
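
(For reference, I believe lib/vsprintf.c defines sprintf() roughly as
below, so the bounded variant costs nothing extra:)

/* roughly how the kernel implements sprintf() in lib/vsprintf.c:
 * it is just vsnprintf() with an effectively unlimited bound */
int sprintf(char *buf, const char *fmt, ...)
{
	va_list args;
	int i;

	va_start(args, fmt);
	i = vsnprintf(buf, INT_MAX, fmt, args);
	va_end(args);

	return i;
}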

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFCv5 7/9] mm: vcm: Virtual Contiguous Memory framework added
  2010-09-06  6:33   ` Michal Nazarewicz
@ 2010-09-21 16:13     ` Konrad Rzeszutek Wilk
  -1 siblings, 0 replies; 59+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-09-21 16:13 UTC (permalink / raw)
  To: Michal Nazarewicz
  Cc: linux-arm-kernel, linux-media, linux-mm, Andrew Morton,
	Daniel Walker, FUJITA Tomonori, Hans Verkuil, Jonathan Corbet,
	KAMEZAWA Hiroyuki, Kyungmin Park, Marek Szyprowski, Mel Gorman,
	Minchan Kim, Pawel Osciak, Peter Zijlstra, Russell King,
	Zach Pfeffer, linux-kernel

> +* The Virtual Contiguous Memory Manager
> +
> +The VCMM was built to solve the system-wide memory mapping issues that
> +occur when many bus-masters have IOMMUs.
> +
> +An IOMMU maps device addresses to physical addresses.  It also
> +insulates the system from spurious or malicious device bus
> +transactions and allows fine-grained mapping attribute control.  The
> +Linux kernel core does not contain a generic API to handle IOMMU
> +mapped memory; device driver writers must implement device specific
> +code to interoperate with the Linux kernel core.  As the number of
> +IOMMUs increases, coordinating the many address spaces mapped by all
> +discrete IOMMUs becomes difficult without in-kernel support.

Looking at the set of calls and the examples it struck me as similar
to the agp.h API (drivers/char/agp/). It has allocate, bind, de-allocate.
Naturally it has no bus device mapping, but the DRM code that utilizes
the AGP API bridge has that: drivers/gpu/drm/ati_pcigart.c (DRM API).

Then there are the radeon and nouveau drivers that program the GPU GART
bypassing the AGP API but still utilize the DMA API.

The nice ASCII art you included in your writeup looks to cover those
use cases.

What I am ineptly trying to say is that we have a bunch of APIs that
do this, and in cases where they are inadequate (or just look to be
a one-off solution) we have functions that are similar from an API
point of view but differ in implementation (check out how the Nouveau
driver programs its VMM compared to how the Radeon driver does it).

Your API offers a way to unify all of that, but it looks to be
an API on top of the other ones.  You would still have to implement
different mechanisms for utilizing this, say in the radeon driver:
the AGP API, or the home-grown GPU GART programming, and then the DMA API
wrapped around them all.  Oh, and the DMA API sits on top of the IOMMU API.

I am not sure how this would solve the proliferation of different APIs
- it sounds like it just adds another piece into which you still have
to shoehorn the other APIs.

But I do understand the problem you are facing.  You want to switch
between different IOMMUs for different drivers using only one API.  Folks
have been asking whether it makes sense to fold your algorithms into the
memory allocator so that it can reserve huge chunks of memory for specific
drivers.  But that does not solve the IOMMU problem.

So my question is, would it perhaps make sense to concentrate on the DMA API?
Could we expand it so that for specific devices the DMA API is set up
to use a different IOMMU?  If you look at the Calgary IOMMU - that is a
perfect example of your problem - it is used only for the specific devices
that fall within its control; all other DMA operations are handled by the
SWIOTLB (the default IOMMU).

Would it be possible to do something similar, so that when CMA is activated
it scans the region list, finds which devices can share the same memory
region, and sets the struct device DMA operations to point to a CMA IOMMU
which is happy to hand out the chunks of memory reserved in the memory
allocator (that would be the code lifted from your CMA and stuck in the
memory allocator), allowing different drivers (those on the whitelist) to
share the same region?

This has the extra benefit that it would allow all drivers that use the
DMA API to take advantage of this without any extra VCM/CMA API calls.
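
Something along these lines is what I am picturing -- purely illustrative
and untested, and cma_region_alloc()/cma_region_free() below are just
placeholders standing in for whatever the real CMA entry points turn out
to be:

#include <linux/dma-mapping.h>
#include <linux/io.h>
#include <linux/mm.h>

/* placeholders, not the real CMA API */
extern unsigned long cma_region_alloc(struct device *dev, size_t size,
				      size_t align);
extern void cma_region_free(struct device *dev, dma_addr_t addr, size_t size);

/* per-device ops whose coherent allocations come from a CMA region */
static void *cma_dma_alloc_coherent(struct device *dev, size_t size,
				    dma_addr_t *handle, gfp_t gfp)
{
	unsigned long phys = cma_region_alloc(dev, size, PAGE_SIZE);

	if (!phys)
		return NULL;
	*handle = phys;			/* no IOMMU here: 1:1 mapping */
	return phys_to_virt(phys);
}

static void cma_dma_free_coherent(struct device *dev, size_t size,
				  void *vaddr, dma_addr_t handle)
{
	cma_region_free(dev, handle, size);
}

static struct dma_map_ops cma_dma_ops = {
	.alloc_coherent	= cma_dma_alloc_coherent,
	.free_coherent	= cma_dma_free_coherent,
	/* map_page/map_sg would point at the region's IOMMU, if any */
};

/* attaching the ops is architecture specific; on x86 this would be
 * roughly dev->archdata.dma_ops = &cma_dma_ops for every whitelisted
 * device found while scanning the region list */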

P.S.
I am being quite x86-specific here - I don't know much about ARM (is there
some desktop box I can buy with it?), so I am probably missing some details.

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2010-09-21 16:18 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-06  6:33 [RFCv5 0/9] CMA + VCMM integration Michal Nazarewicz
2010-09-06  6:33 ` Michal Nazarewicz
2010-09-06  6:33 ` Michal Nazarewicz
2010-09-06  6:33 ` [RFCv5 1/9] lib: rbtree: rb_root_init() function added Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33 ` [RFCv5 2/9] mm: cma: Contiguous Memory Allocator added Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33 ` [RFCv5 3/9] mm: cma: Added SysFS support Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06 21:07   ` Greg KH
2010-09-06 21:07     ` Greg KH
2010-09-06 21:07     ` Greg KH
2010-09-07  5:31     ` Michał Nazarewicz
2010-09-07  5:31       ` Michał Nazarewicz
2010-09-07  5:31       ` Michał Nazarewicz
2010-09-07  6:08       ` Greg KH
2010-09-07  6:08         ` Greg KH
2010-09-07  6:08         ` Greg KH
2010-09-07  6:55         ` Michał Nazarewicz
2010-09-07  6:55           ` Michał Nazarewicz
2010-09-07  6:55           ` Michał Nazarewicz
2010-09-06  6:33 ` [RFCv5 4/9] mm: cma: Added command line parameters support Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33 ` [RFCv5 5/9] mm: cma: Test device and application added Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33 ` [RFCv5 6/9] ARM: cma: Added CMA to Aquila, Goni and c210 universal boards Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33 ` [RFCv5 7/9] mm: vcm: Virtual Contiguous Memory framework added Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-21 16:13   ` Konrad Rzeszutek Wilk
2010-09-21 16:13     ` Konrad Rzeszutek Wilk
2010-09-06  6:33 ` [RFCv5 8/9] mm: vcm: Sample driver added Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06 21:10   ` Greg KH
2010-09-06 21:10     ` Greg KH
2010-09-06 21:10     ` Greg KH
2010-09-07  1:58     ` Michał Nazarewicz
2010-09-07  1:58       ` Michał Nazarewicz
2010-09-07  1:58       ` Michał Nazarewicz
2010-09-06  6:33 ` [RFCv5 9/9] mm: vcm: vcm-cma: VCM CMA " Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06  6:33   ` Michal Nazarewicz
2010-09-06 21:09 ` [RFCv5 0/9] CMA + VCMM integration Greg KH
2010-09-06 21:09   ` Greg KH
2010-09-06 21:09   ` Greg KH
2010-09-07  1:40   ` Michał Nazarewicz
2010-09-07  1:40     ` Michał Nazarewicz
2010-09-07  1:40     ` Michał Nazarewicz
2010-09-07  2:34     ` Greg KH
2010-09-07  2:34       ` Greg KH
2010-09-07  2:34       ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.