* [PATCHv2 0/4] The Contiguous Memory Allocator
@ 2010-07-26 14:11 ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

Hello everyone,

The following patchset implements a Contiguous Memory Allocator.  For
those who have not yet come across CMA, here is an excerpt from the
documentation:

   The Contiguous Memory Allocator (CMA) is a framework, which allows
   setting up a machine-specific configuration for physically-contiguous
   memory management. Memory for devices is then allocated according
   to that configuration.

   The main role of the framework is not to allocate memory, but to
   parse and manage memory configurations, as well as to act as an
   in-between between device drivers and pluggable allocators. It is
   thus not tied to any memory allocation method or strategy.

For more information please refer to the second patch from the
patchset which contains the documentation.


This is the second version of the patchset.  All of the changes are
concentrated in the second patch -- the other patches are almost
identical.

Major observable changes are:

1. The "cma_map" command line parameter has been removed.  In
   exchange, a SysFS entry has been created under
   kernel/mm/contiguous.
   
   The configuration strings passed to CMA are now called attributes
   in the documentation.

   The intended way of specifying the attributes is to call
   cma_set_defaults() from platform initialisation code.  The
   "regions" attribute (the string specified by the "cma" command
   line parameter) can be overridden on the command line; the other
   attributes can be changed at run-time using the SysFS entries.

   (I still believe that the other attributes should have their own
   command line arguments as well, but since they caused a lot of
   controversy (and many stopped reading after encountering them),
   "cma_map" has been removed.)

2. The behaviour of the "map" attribute has been modified slightly.
   Now, if no rule matches a given device, it is assigned the regions
   specified by the "asterisk" attribute.  By default, that attribute
   is built from the region names given in the "regions" attribute.

   This also means that if no "map" is specified, all devices use all
   the regions specified in the "regions" attribute.  This should be
   a handy default.

3. Devices can register private regions as well as regions that can be
   shared but are not reserved using standard CMA mechanisms.
   A private region has no name and can be accessed only by devices
   that have the pointer to it.

   Moreover, if a device manages to run its code early enough, it can
   register an "early region".  An early region is one for which
   memory has not yet been reserved.  At some point, platform
   initialisation code reserves memory for all registered early
   regions and, if this succeeds, those regions are registered as
   normal regions that can be used with the standard API.  This may be
   handy for devices that need a private region but don't want to
   worry about reserving it.

4. The way allocators are registered has changed.  Now,
   a cma_allocator_register() function is used for that purpose.
   Moreover, allocators are attached to regions the first time memory
   is allocated from the region or when the allocator is registered,
   which means that allocators can be modules that are loaded after
   the kernel has booted (of course, it won't be possible to allocate
   a chunk of memory from a region if its allocator is not loaded).
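
Point 2 above says that the "asterisk" attribute is, by default, built
from the region names given in "regions".  A minimal userspace sketch
of that derivation follows.  The function name default_asterisk() and
the comma used to separate names in the result are assumptions made
for illustration; this is not code taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collect the region names from a "regions" string such as
 * "region=20M;common=5M" into a comma-separated list.
 *
 * Hypothetical helper: returns 0 on success, -1 if an entry has no
 * name or no '=', or if the output buffer is too small.  An optional
 * '-' in front of a name is not treated specially here.
 */
static int default_asterisk(const char *regions, char *out, size_t outsz)
{
	size_t used = 0;

	assert(regions != NULL && out != NULL && outsz > 0);
	out[0] = '\0';

	while (*regions) {
		const char *end = strchr(regions, ';');
		size_t entry_len = end ? (size_t)(end - regions)
				       : strlen(regions);

		if (entry_len) {
			const char *eq = memchr(regions, '=', entry_len);
			size_t name_len;

			if (!eq || eq == regions)
				return -1;
			name_len = (size_t)(eq - regions);

			/* separator (if any) + name + terminating NUL */
			if (used + (used ? 1 : 0) + name_len + 1 > outsz)
				return -1;
			if (used)
				out[used++] = ',';
			memcpy(out + used, regions, name_len);
			used += name_len;
			out[used] = '\0';
		}

		regions += entry_len;
		if (*regions == ';')
			regions++;
	}
	return 0;
}
```

Given "region=20M;common=5M", the sketch yields "region,common", so
a device that no "map" rule matches would be allowed to use both
regions.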


Index of new functions:

+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)

+static inline int
+cma_info_about(struct cma_info *info, const char *regions)

+int __must_check cma_region_register(struct cma_region *reg);

+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);

+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions,
+               size_t size, dma_addr_t alignment);

+int cma_allocator_register(struct cma_allocator *alloc);


The patches in the patchset include:

Michal Nazarewicz (4):
  lib: rbtree: rb_root_init() function added

    The rb_root_init() function initialises an RB tree with a single
    node placed in the root.  This is more convenient than
    initialising an empty tree and then adding an element.

  mm: cma: Contiguous Memory Allocator added

    This is the main patch of the set; it implements the CMA framework,
    including the best-fit allocator.  It also adds documentation.

  mm: cma: Test device and application added

    This patch adds a misc device that works as a proxy to the CMA
    framework and a simple testing application.  This lets one test
    the whole framework from user space as well as replay a recorded
    allocate/free sequence.

  arm: Added CMA to Aquila and Goni

    This patch adds the CMA platform initialisation code to two ARM
    platforms.  It serves as an example of how this is achieved.

 Documentation/00-INDEX                             |    2 +
 .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
 Documentation/contiguous-memory.txt                |  646 +++++++++++
 Documentation/kernel-parameters.txt                |    4 +
 arch/arm/mach-s5pv210/mach-aquila.c                |   13 +
 arch/arm/mach-s5pv210/mach-goni.c                  |   13 +
 drivers/misc/Kconfig                               |    8 +
 drivers/misc/Makefile                              |    1 +
 drivers/misc/cma-dev.c                             |  184 +++
 include/linux/cma.h                                |  475 ++++++++
 include/linux/rbtree.h                             |   11 +
 mm/Kconfig                                         |   34 +
 mm/Makefile                                        |    3 +
 mm/cma-best-fit.c                                  |  407 +++++++
 mm/cma.c                                           | 1170 ++++++++++++++++++++
 tools/cma/cma-test.c                               |  373 +++++++
 16 files changed, 3353 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c
 create mode 100644 tools/cma/cma-test.c


* [PATCHv2 1/4] lib: rbtree: rb_root_init() function added
  2010-07-26 14:11 ` Michal Nazarewicz
  (?)
@ 2010-07-26 14:11   ` Michal Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

Added an rb_root_init() function which initialises an rb_root
structure as a red-black tree with at most one element.  The
rationale is that using rb_root_init(root, node) is more
straightforward and cleaner than first initialising an
empty tree and then performing an insert operation.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 include/linux/rbtree.h |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)
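
A userspace sketch of how the helper might be used.  struct rb_node
and struct rb_root below are cut-down stand-ins for the kernel
definitions, and struct chunk, chunk_tree_init(), node_is_black() and
node_parent() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel's rbtree types. */
struct rb_node {
	unsigned long rb_parent_color;	/* parent pointer, colour in bit 0 */
	struct rb_node *rb_right;
	struct rb_node *rb_left;
};

struct rb_root {
	struct rb_node *rb_node;
};

#define RB_BLACK 1UL

/* Same logic as the helper added by this patch. */
static void rb_root_init(struct rb_root *root, struct rb_node *node)
{
	assert(root != NULL);
	root->rb_node = node;
	if (node) {
		node->rb_parent_color = RB_BLACK; /* black, no parent */
		node->rb_left  = NULL;
		node->rb_right = NULL;
	}
}

/* Hypothetical object kept in the tree, e.g. a free chunk of memory. */
struct chunk {
	struct rb_node node;
	unsigned long start;
	unsigned long size;
};

/* Start a tree that holds exactly one chunk (or none if first is NULL). */
static void chunk_tree_init(struct rb_root *tree, struct chunk *first)
{
	rb_root_init(tree, first ? &first->node : NULL);
}

/* Small accessors used to inspect the result. */
static int node_is_black(const struct rb_node *n)
{
	return (n->rb_parent_color & 1UL) != 0;
}

static struct rb_node *node_parent(const struct rb_node *n)
{
	return (struct rb_node *)(n->rb_parent_color & ~3UL);
}
```

A single call to chunk_tree_init() replaces the two steps of setting
the root to an empty tree and then linking and recolouring the first
node.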

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 7066acb..5b6dc66 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -130,6 +130,17 @@ static inline void rb_set_color(struct rb_node *rb, int color)
 }
 
 #define RB_ROOT	(struct rb_root) { NULL, }
+
+static inline void rb_root_init(struct rb_root *root, struct rb_node *node)
+{
+	root->rb_node = node;
+	if (node) {
+		node->rb_parent_color = RB_BLACK; /* black, no parent */
+		node->rb_left  = NULL;
+		node->rb_right = NULL;
+	}
+}
+
 #define	rb_entry(ptr, type, member) container_of(ptr, type, member)
 
 #define RB_EMPTY_ROOT(root)	((root)->rb_node == NULL)
-- 
1.7.1


* [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-26 14:11   ` Michal Nazarewicz
  (?)
@ 2010-07-26 14:11     ` Michal Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

The Contiguous Memory Allocator framework is a set of APIs for
allocating physically contiguous chunks of memory.

Various chips require contiguous blocks of memory to operate.  Those
chips include devices such as cameras, hardware video decoders and
encoders, etc.

The code is highly modular and customisable to suit the needs of
various users.  The set of regions reserved for CMA can be configured
at run-time, and it is easy to add custom allocator algorithms if
needed.

For more details see Documentation/contiguous-memory.txt.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
---
 Documentation/00-INDEX                             |    2 +
 .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
 Documentation/contiguous-memory.txt                |  646 +++++++++++
 Documentation/kernel-parameters.txt                |    4 +
 include/linux/cma.h                                |  445 ++++++++
 mm/Kconfig                                         |   34 +
 mm/Makefile                                        |    3 +
 mm/cma-best-fit.c                                  |  407 +++++++
 mm/cma.c                                           | 1170 ++++++++++++++++++++
 9 files changed, 2720 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c
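
To illustrate the format of a single entry of the "regions" attribute
as given by the grammar in Documentation/contiguous-memory.txt, here
is a minimal userspace sketch of a parser.  It is not the code from
mm/cma.c: struct region_spec, parse_region(), parse_memsize() and
copy_token() are hypothetical names, and parse_memsize() only
approximates memparse() by handling the K, M and G suffixes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result of parsing one region entry. */
struct region_spec {
	int dash;			/* entry started with '-' */
	char name[32];			/* REG-NAME */
	char alloc[32];			/* ALLOC-NAME, "" if not given */
	unsigned long long size;
	unsigned long long start;	/* 0 if not given */
	unsigned long long alignment;	/* 0 if not given */
};

/* Parse a number with an optional K/M/G suffix and advance *s past it. */
static int parse_memsize(const char **s, unsigned long long *out)
{
	char *end;
	unsigned long long v = strtoull(*s, &end, 0);

	if (end == *s)
		return -1;
	switch (*end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}
	*out = v;
	*s = end;
	return 0;
}

/* Copy characters up to (not including) any character in stops. */
static int copy_token(const char **s, const char *stops,
		      char *dst, size_t dstsz)
{
	size_t len = strcspn(*s, stops);

	if (len == 0 || len >= dstsz)
		return -1;
	memcpy(dst, *s, len);
	dst[len] = '\0';
	*s += len;
	return 0;
}

/* Parse one entry: [-]NAME=size[@start][/alignment][:ALLOC-NAME] */
static int parse_region(const char *s, struct region_spec *r)
{
	assert(s != NULL && r != NULL);
	memset(r, 0, sizeof(*r));

	if (*s == '-') {
		r->dash = 1;
		s++;
	}
	if (copy_token(&s, "=;", r->name, sizeof(r->name)) || *s++ != '=')
		return -1;
	if (parse_memsize(&s, &r->size))
		return -1;
	if (*s == '@' && (s++, parse_memsize(&s, &r->start)))
		return -1;
	if (*s == '/' && (s++, parse_memsize(&s, &r->alignment)))
		return -1;
	if (*s == ':' && (s++, copy_token(&s, ";", r->alloc, sizeof(r->alloc))))
		return -1;
	return *s == '\0' ? 0 : -1;
}
```

For example, parsing "region=30M:na" fills in the name "region",
a size of 30 MiB and the allocator name "na", leaving start and
alignment at zero.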

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 5405f7a..bb50209 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -94,6 +94,8 @@ connector/
 	- docs on the netlink based userspace<->kernel space communication mod.
 console/
 	- documentation on Linux console drivers.
+contiguous-memory.txt
+	- documentation on physically-contiguous memory allocation framework.
 cpu-freq/
 	- info on CPU frequency and voltage scaling.
 cpu-hotplug.txt
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
new file mode 100644
index 0000000..05e2f6a
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
@@ -0,0 +1,9 @@
+What:		/sys/kernel/mm/contiguous/
+Date:		July 2008
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		/sys/kernel/mm/contiguous/ contains two files: asterisk and
+		map.  They are used to configure the Contiguous Memory
+		Allocator framework.
+
+		For details see Documentation/contiguous-memory.txt.
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
new file mode 100644
index 0000000..6eb1295
--- /dev/null
+++ b/Documentation/contiguous-memory.txt
@@ -0,0 +1,646 @@
+                                                             -*- org -*-
+
+* Contiguous Memory Allocator
+
+   The Contiguous Memory Allocator (CMA) is a framework, which allows
+   setting up a machine-specific configuration for physically-contiguous
+   memory management. Memory for devices is then allocated according
+   to that configuration.
+
+   The main role of the framework is not to allocate memory, but to
+   parse and manage memory configurations, as well as to act as an
+   in-between between device drivers and pluggable allocators. It is
+   thus not tied to any memory allocation method or strategy.
+
+** Why is it needed?
+
+    Various devices on embedded systems have no scatter-gather and/or
+    IO map support and as such require contiguous blocks of memory to
+    operate.  They include devices such as cameras, hardware video
+    decoders and encoders, etc.
+
+    Such devices often require big memory buffers (a full HD frame is,
+    for instance, more than 2 megapixels large, i.e. more than 6 MB
+    of memory), which makes mechanisms such as kmalloc() ineffective.
+
+    Some embedded devices impose additional requirements on the
+    buffers, e.g. they can operate only on buffers allocated in
+    particular location/memory bank (if system has more than one
+    memory bank) or buffers aligned to a particular memory boundary.
+
+    Development of embedded devices has seen a big rise recently
+    (especially in the V4L area) and many such drivers include their
+    own memory allocation code. Most of them use bootmem-based methods.
+    The CMA framework is an attempt to unify contiguous memory allocation
+    mechanisms and provide a simple API for device drivers, while
+    staying as customisable and modular as possible.
+
+** Design
+
+    The main design goal for the CMA was to provide a customisable and
+    modular framework, which could be configured to suit the needs of
+    individual systems.  Configuration specifies a list of memory
+    regions, which then are assigned to devices.  Memory regions can
+    be shared among many device drivers or assigned exclusively to
+    one.  This has been achieved in the following ways:
+
+    1. The core of the CMA does not handle allocation of memory and
+       management of free space.  Dedicated allocators are used for
+       that purpose.
+
+       This way, if the provided solution does not match demands
+       imposed on a given system, one can develop a new algorithm and
+       easily plug it into the CMA framework.
+
+       The presented solution includes an implementation of a best-fit
+       algorithm.
+
+    2. CMA allows a run-time configuration of the memory regions it
+       will use to allocate chunks of memory from.  The set of memory
+       regions is given on command line so it can be easily changed
+       without the need for recompiling the kernel.
+
+       Each region has its own size, alignment demand, a start
+       address (physical address where it should be placed) and an
+       allocator algorithm assigned to the region.
+
+       This means that there can be different algorithms running at
+       the same time, if different devices on the platform have
+       distinct memory usage characteristics and different algorithms
+       match those best.
+
+    3. When requesting memory, devices have to introduce themselves.
+       This way CMA knows who the memory is allocated for.  This
+       allows for the system architect to specify which memory regions
+       each device should use.
+
+       3a. Devices can also specify a "kind" of memory they want.
+           This makes it possible to configure the system in such
+           a way, that a single device may get memory from different
+           memory regions, depending on the "kind" of memory it
+           requested.  For example, a video codec driver might want to
+           allocate some shared buffers from the first memory bank and
+           the other from the second to get the highest possible
+           memory throughput.
+
+    4. For greater flexibility and extensibility, the framework allows
+       device drivers to register private regions of reserved memory
+       which then may be used only by them.
+
+       As a result, even if a driver does not use the rest of the CMA
+       interface, it can still use CMA allocators and other
+       mechanisms.
+
+       4a. Early in the boot process, device drivers can also ask the
+           CMA framework to reserve a region of memory for them,
+           which will then be used as a private region.
+
+           This way, drivers do not need to directly call bootmem,
+           memblock or similar early allocator but merely register an
+           early region and the framework will handle the rest
+           including choosing the right early allocator.
+
+** Use cases
+
+    Let's analyse some imaginary system that uses the CMA to see how
+    the framework can be used and configured.
+
+
+    We have a platform with a hardware video decoder and a camera each
+    needing 20 MiB of memory in worst case.  Our system is written in
+    such a way though that the two devices are never used at the same
+    time and memory for them may be shared.  In such a system the
+    following two configuration strings would be used:
+
+        static char regions[]  __initdata = "region=20M";
+        static char map[]      __initdata = "video,camera=region";
+
+        cma_set_defaults(regions, map, NULL);
+
+    .regions instructs CMA to allocate a region of 20 MiB and use the
+    first available memory allocator on it.  .map says that drivers
+    named "video" and "camera" are to be granted memory from the
+    previously defined region.
+
+    This can in fact be written in a simpler way:
+
+        static char regions[]  __initdata = "region=20M";
+
+        cma_set_defaults(regions, NULL, NULL);
+
+    By omitting the .map attribute, we say that all drivers are to use
+    all the regions defined with the first argument.  Generally, if
+    a device/kind pair is not matched by any rule from the .map it is
+    granted access to all regions from the .regions attribute.
+
+    We can see that, because the devices share the same region of
+    memory, we save 20 MiB of memory compared to the situation where
+    each of the devices reserves 20 MiB of memory for itself.
+
+
+    Now, let's say that we also have many other smaller devices and we
+    want them to share some smaller pool of memory, for instance 5
+    MiB.  This can be achieved in several ways:
+
+        static char regions[]  __initdata = "region=20M;common=5M";
+        static char map[]      __initdata =
+                "video,camera=region;*/*=common";
+
+        cma_set_defaults(regions, map, NULL);
+
+    This instructs CMA to reserve two regions and let video and camera
+    use region "region" whereas all other devices should use region
+    "common".  Alternatively, we might say:
+
+        static char regions[]  __initdata = "region=20M;common=5M";
+        static char map[]      __initdata = "video,camera=region";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    It works in a similar fashion, except that by specifying the
+    .asterisk attribute, we say that devices not matched by any rule
+    from .map should access the regions listed in .asterisk.
+
+
+    Later on, after some development, the system can now run the
+    video decoder and the camera at the same time.  The 20 MiB region is
+    no longer enough for the two to share.  A quick fix can be made to
+    grant each of those devices separate regions:
+
+        static char regions[]  __initdata = "v=20M;c=20M;common=5M";
+        static char map[]      __initdata = "video=v;camera=c";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    This solution also shows how with CMA you can assign private pools
+    of memory to each device if that is required.
+
+    Allocation mechanisms can be replaced dynamically in a similar
+    manner as well. Let's say that during testing, it has been
+    discovered that, for a given shared region of 40 MiB,
+    fragmentation has become a problem.  It has been observed that,
+    after some time, it becomes impossible to allocate buffers of the
+    required sizes. So to satisfy our requirements, we would have to
+    reserve a larger shared region beforehand.
+
+    But fortunately, you have also managed to develop a new allocation
+    algorithm -- Neat Allocation Algorithm or "na" for short -- which
+    satisfies the needs for both devices even on a 30 MiB region.  The
+    configuration can be then quickly changed to:
+
+        static char regions[]  __initdata = "region=30M:na;common=5M";
+        static char map[]      __initdata = "video,camera=region";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    This shows how you can develop your own allocation algorithms if
+    the ones provided with CMA do not suit your needs and easily
+    replace them, without the need to modify the CMA core or even
+    recompile the kernel.
+
+** Technical Details
+
+*** The attributes
+
+    As shown above, CMA is configured by a set of attributes: regions,
+    map and asterisk.  The first one specifies regions that are to be
+    reserved for CMA.  The second one specifies what regions each
+    device is assigned to.  The third one specifies what regions
+    devices without a rule in map should have access to.
+
+**** Regions
+
+     The format of the "regions" attribute is as follows:
+
+         regions-attr  ::= [ regions [ ';' ] ]
+         regions       ::= region [ ';' regions ]
+
+         region        ::= [ '-' ] REG-NAME
+                             '=' size
+                           [ '@' start ]
+                           [ '/' alignment ]
+                           [ ':' ALLOC-NAME ]
+
+         size          ::= MEMSIZE   // size of the region
+         start         ::= MEMSIZE   // desired start address of
+                                     // the region
+         alignment     ::= MEMSIZE   // alignment of the start
+                                     // address of the region
+
+     REG-NAME specifies the name of the region.  All regions given
+     via the regions attribute need to have a name.  Moreover, all
+     regions need to have a unique name.  If two regions have the same
+     name it is unspecified which will be used when requesting to
+     allocate memory from region with given name.
+
+     ALLOC-NAME specifies the name of allocator to be used with the
+     region.  If no allocator name is provided, the "default"
+     allocator will be used with the region.  The "default" allocator
+     is, of course, the first allocator that has been registered. ;)
+
+     size, start and alignment are specified in bytes with suffixes
+     that memparse() accepts.  If start is given, the region will be
+     reserved at the given starting address (or as close to it as
+     possible).  If alignment is specified, the region will be aligned
+     to the given value.
+
+     This attribute can also be specified by the "cma" command line
+     argument.  If it is, the platform's defaults won't be used.
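+
+     As a rough illustration of the MEMSIZE syntax, below is
+     a userspace sketch of suffix parsing.  It only approximates what
+     the kernel's memparse() does for the K, M and G suffixes;
+     parse_memsize() is a name made up for this example and is not
+     part of CMA.
+
```c
#include <stdlib.h>

/*
 * Userspace approximation of memparse(): parse a number and scale it
 * by an optional K, M or G suffix (powers of 1024).  If endp is not
 * NULL, *endp is set to the first character that was not consumed.
 * parse_memsize() is a hypothetical helper, not a kernel function.
 */
unsigned long long parse_memsize(const char *s, char **endp)
{
	char *end;
	unsigned long long value = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		value <<= 10;
		/* fall through */
	case 'M': case 'm':
		value <<= 10;
		/* fall through */
	case 'K': case 'k':
		value <<= 10;
		end++;		/* consume the suffix */
		break;
	default:
		break;
	}

	if (endp)
		*endp = end;
	return value;
}
```
+
+     Called on "64M@512M", such a helper would stop at the '@', after
+     which the start address can be parsed the same way.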
+
+**** Map
+
+     The format of the "map" attribute is as follows:
+
+         map-attr      ::= [ rules [ ';' ] ]
+         rules         ::= rule [ ';' rules ]
+         rule          ::= patterns '=' regions
+
+         patterns      ::= pattern [ ',' patterns ]
+
+         regions       ::= REG-NAME [ ',' regions ] | '*'
+                       // list of regions to try to allocate memory
+                       // from
+
+         pattern       ::= dev-pattern [ '/' kind-pattern ]
+                       | '/' kind-pattern
+                       // pattern request must match for the rule to
+                       // apply; the first rule that matches is
+                       // applied; if dev-pattern part is omitted
+                       // value identical to the one used in previous
+                       // pattern is assumed
+
+         dev-pattern   ::= PATTERN-STR
+                       // pattern that device name must match for the
+                       // rule to apply.
+         kind-pattern  ::= PATTERN-STR
+                       // pattern that "kind" of memory (provided by
+                       // device) must match for the rule to apply.
+
+     The "map" attribute is a sequence of rules which specify which
+     regions a given (device, kind) pair should use.  The first rule
+     that matches is applied.
+
+     For a rule to match, its pattern must match the (dev, kind) pair.
+     A pattern consists of the parts before and after the slash.  The
+     first part must match the device name and the second part must
+     match the kind.
+
+     If the first part is empty, the device name is assumed to match
+     iff it matched in the previous pattern.
+
+     Not specifying the second part matches only an empty or
+     not-specified kind.
+
+     Patterns may contain question marks, each of which matches any
+     single character, and may end with an asterisk, which matches the
+     rest of the string (including nothing).
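+
+     The matching of a single PATTERN-STR can be sketched in plain
+     C as follows.  This is an illustration written for this text, not
+     the code CMA uses; pattern_match() is a hypothetical name, and
+     anything following an asterisk in the pattern is simply ignored
+     here.
+
```c
#include <stdbool.h>

/*
 * Returns true if str matches pattern.  '?' matches any single
 * character; '*' matches the rest of the string (including nothing).
 * pattern_match() is a made-up helper for illustration only.
 */
bool pattern_match(const char *pattern, const char *str)
{
	for (; *pattern; pattern++, str++) {
		if (*pattern == '*')
			return true;	/* rest of str is accepted */
		if (*str == '\0')
			return false;	/* str ended before pattern */
		if (*pattern != '?' && *pattern != *str)
			return false;	/* literal mismatch */
	}
	return *str == '\0';		/* no trailing characters left */
}
```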
+
+     The '*' as the list of regions means to use all the regions
+     listed in the "asterisk" attribute.  Moreover, if no rule matches
+     a device the list of regions from the "asterisk" attribute is
+     used as well (ie. a "*/*=*" rule is assumed at the end).
+
+     This attribute is accessible via SysFS and can be changed at
+     run-time by writing to /sys/kernel/mm/contiguous/map.
+
+**** Asterisk
+
+     The format of the "asterisk" attribute is as follows:
+
+         asterisk-attr ::= [ regions ]
+         regions       ::= REG-NAME [ ',' regions ]
+
+     If no rule from the "map" attribute matches a given driver, the
+     regions specified by the "asterisk" attribute are used.
+
+     If this attribute is not specified by a call to cma_set_defaults(),
+     it is built automatically from the regions listed in the
+     "regions" attribute.  All regions are included except for the ones
+     whose name is prefixed by a minus sign.
+
+     For instance, the following:
+
+         static char regions[]  __initdata = "region=20M;common=5M";
+         static char map[]      __initdata = "video,camera=region";
+         static char asterisk[] __initdata = "common";
+
+     can be replaced by:
+
+         static char regions[]  __initdata = "-region=20M;common=5M";
+         static char map[]      __initdata = "video,camera=region";
+
+     The biggest difference is that if the user specifies the "cma"
+     command line parameter and the platform did not specify the
+     "asterisk" attribute, it will be built from the user-supplied
+     "cma" parameter.  If the "asterisk" attribute were specified by
+     platform code, it would not be rebuilt from the command line
+     parameter.
+
+     This attribute is accessible via SysFS and can be changed at
+     run-time by writing to /sys/kernel/mm/contiguous/asterisk.
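+
+     How the "asterisk" list could be derived from a "regions" string
+     is sketched below in userspace C.  The helper name
+     sketch_build_asterisk() and its buffer-based interface are
+     assumptions made for this example; the sketch also assumes the
+     string contains no whitespace.
+
```c
#include <stddef.h>
#include <string.h>

/*
 * Build a comma-separated list of region names from a "regions"
 * attribute, skipping every region whose name is prefixed by '-'.
 * Returns 0 on success or -1 if out is too small.
 * sketch_build_asterisk() is a hypothetical helper, not a CMA call.
 */
int sketch_build_asterisk(const char *regions, char *out, size_t outsz)
{
	size_t used = 0;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	while (*regions) {
		size_t entry = strcspn(regions, ";");	/* whole region */
		size_t name = strcspn(regions, "=;");	/* its REG-NAME */

		if (name > 0 && regions[0] != '-') {
			size_t need = name + (used ? 1 : 0);

			if (used + need >= outsz)
				return -1;	/* no room for name + NUL */
			if (used)
				out[used++] = ',';
			memcpy(out + used, regions, name);
			used += name;
			out[used] = '\0';
		}

		regions += entry;
		if (*regions == ';')
			regions++;
	}
	return 0;
}
```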
+
+**** Examples
+
+     Some examples (whitespace added for better readability):
+
+         cma = r1 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                                // (or at least as near as possible)
+                    /1M         // make sure it's aligned to 1M
+                    :foo(bar);  // uses allocator "foo" with "bar"
+                                // as parameters for it
+               r2 = 64M       // 64M region
+                    /1M;        // make sure it's aligned to 1M
+                                // uses the first available allocator
+               r3 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                    :foo;       // uses allocator "foo" with no parameters
+
+         map     = foo = r1;
+                       // device foo with kind==NULL uses region r1
+
+                   foo/quaz = r2;  // OR:
+                   /quaz = r2;
+                       // device foo with kind == "quaz" uses region r2
+
+                   foo/* = r3;     // OR:
+                   /* = r3;
+                       // device foo with any other kind uses region r3
+
+                   bar/* = r1,r2;
+                       // device bar with any kind uses region r1 or r2
+
+                   baz?/a* , baz?/b* = r3;
+                       // devices named baz? where ? is any character
+                       // with kind being a string starting with "a" or
+                       // "b" use r3
+
+*** The device and kind of memory
+
+    The name of the device is taken from the device structure.  It is
+    not possible to use CMA if the driver does not register a device
+    (although this can be overcome if a fake device structure is
+    provided with at least the name set).
+
+    The kind of memory is an optional argument provided by the device
+    whenever it requests a memory chunk.  In many cases this can be
+    ignored but sometimes it may be required for some devices.
+
+    For instance, let us say that there are two memory banks and for
+    performance reasons a device uses buffers in both of them.  In
+    such a case, the device driver would define two kinds and use them
+    for different buffers.  CMA attributes could look as follows:
+
+         static char regions[]  __initdata = "a=32M@0;b=32M@512M";
+         static char map[]      __initdata = "foo/a=a;foo/b=b";
+
+    Whenever the driver allocates memory, it specifies the kind of
+    memory:
+
+        buffer1 = cma_alloc(dev, "a", 1 << 20, 0);
+        buffer2 = cma_alloc(dev, "b", 1 << 20, 0);
+
+    If the driver should also try the other bank when the dedicated
+    one is full, the attributes could be changed to:
+
+         static char regions[]  __initdata = "a=32M@0;b=32M@512M";
+         static char map[]      __initdata = "foo/a=a,b;foo/b=b,a";
+
+    On the other hand, if the same driver was used on a system with
+    only one bank, the attributes could be changed to:
+
+         static char regions[]  __initdata = "r=64M";
+         static char map[]      __initdata = "foo/*=r";
+
+    without the need to change the driver at all.
+
+*** Device API
+
+    There are four calls provided by the CMA framework to devices.  To
+    allocate a chunk of memory cma_alloc() function needs to be used:
+
+        dma_addr_t cma_alloc(const struct device *dev, const char *kind,
+                             size_t size, dma_addr_t alignment);
+
+    If required, the device may specify the alignment in bytes that
+    the chunk needs to satisfy.  It has to be a power of two or zero.
+    The chunks are always aligned at least to a page.
+
+    The kind specifies the kind of memory as described in the
+    previous subsection.  If the device driver does not use the notion
+    of memory kinds, it is safe to pass NULL as the kind.
+
+    The basic usage of the function is just:
+
+        addr = cma_alloc(dev, NULL, size, 0);
+
+    The function returns the physical address of the allocated chunk
+    or a value that evaluates to true when checked with
+    IS_ERR_VALUE(), so the correct way of checking for errors is:
+
+        unsigned long addr = cma_alloc(dev, NULL, size, 0);
+        if (IS_ERR_VALUE(addr))
+                return (int)addr;
+        /* Allocated */
+        /* Allocated */
+
+    (Make sure to include <linux/err.h> which contains the definition
+    of the IS_ERR_VALUE() macro.)
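+
+    The convention behind IS_ERR_VALUE() is that the topmost 4095
+    values of an unsigned long encode negated errno codes.  Below is
+    a userspace sketch of that check; sketch_is_err_value() and
+    sketch_alloc() are stand-ins invented for this example, and the
+    returned address is made up.
+
```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_MAX_ERRNO 4095UL

/* Userspace stand-in for IS_ERR_VALUE(): true for the top 4095 values. */
bool sketch_is_err_value(unsigned long addr)
{
	return addr >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/*
 * Hypothetical allocator stub: fails with -ENOMEM cast to an unsigned
 * value when size exceeds limit, otherwise returns a fake address.
 */
unsigned long sketch_alloc(unsigned long size, unsigned long limit)
{
	if (size > limit)
		return (unsigned long)-ENOMEM;
	return 0x20000000UL;	/* made-up physical address */
}
```
+
+    Because the error code travels inside the returned value, the
+    caller can cast it back to a signed type to recover the errno.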
+
+
+    Allocated chunk is freed via a cma_put() function:
+
+        int cma_put(dma_addr_t addr);
+
+    It takes the physical address of the chunk as an argument and
+    decreases its reference counter.  If the counter reaches zero the
+    chunk is freed.  Most of the time users do not need to think about
+    the reference counter and can simply use cma_put() as a free call.
+
+    If, however, a chunk is to be shared with others, the built-in
+    reference counter may turn out to be handy.  To increment it, one
+    needs to use the cma_get() function:
+
+        int cma_get(dma_addr_t addr);
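+
+    The get/put semantics can be modelled in a few lines of userspace
+    C.  The structure and function names below are hypothetical, and
+    the sketch assumes that a freshly allocated chunk starts with one
+    reference; the return values of sketch_put() follow those
+    documented for cma_put() (one if the chunk was freed, zero if
+    not).
+
```c
#include <stdbool.h>

/* Hypothetical stand-in for a chunk; not the kernel's struct cma_chunk. */
struct sketch_chunk {
	unsigned refs;	/* assumed to be 1 right after allocation */
	bool freed;
};

/* Take an additional reference to the chunk. */
void sketch_get(struct sketch_chunk *c)
{
	c->refs++;
}

/* Drop a reference; returns 1 if the chunk was freed, 0 otherwise. */
int sketch_put(struct sketch_chunk *c)
{
	if (--c->refs)
		return 0;
	c->freed = true;
	return 1;
}
```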
+
+
+    The last function is the cma_info() which returns information
+    about regions assigned to given (dev, kind) pair.  Its syntax is:
+
+        int cma_info(struct cma_info *info,
+                     const struct device *dev,
+                     const char *kind);
+
+    On successful exit it fills the info structure with lower and
+    upper bound of regions, total size and number of regions assigned
+    to given (dev, kind) pair.
+
+**** Dynamic and private regions
+
+     In the basic setup, regions are provided via command line argument
+     and then reserved by platform initialisation code (which usually
+     calls the cma_early_regions_reserve() function).
+
+     It is, however, possible to create and add regions dynamically
+     using cma_region_register() function.
+
+         int cma_region_register(struct cma_region *reg);
+
+     The region does not have to have a name.  If it does not, it
+     cannot be accessed via the standard mapping (the one provided by
+     the "map" attribute).  Such regions are private and to allocate
+     a chunk from them, one needs to call:
+
+         dma_addr_t cma_alloc_from_region(struct cma_region *reg,
+                                          size_t size, dma_addr_t alignment);
+
+     It is just like cma_alloc() except that one specifies which region
+     to allocate memory from.  The region must have been registered.
+
+**** Allocating from region specified by name
+
+     If a driver prefers to allocate from a region or list of regions
+     whose names it knows, it can use a different call, similar to the
+     previous one:
+
+         dma_addr_t cma_alloc_from(const char *regions,
+                                   size_t size, dma_addr_t alignment);
+
+     The first argument is a comma-separated list of regions the
+     driver desires CMA to try and allocate from.  The list is
+     terminated by NUL byte or a semicolon.
+
+     Similarly, there is a call for requesting information about named
+     regions:
+
+        int cma_info_about(struct cma_info *info, const char *regions);
+
+     Generally, it should not be needed to use those interfaces but
+     they are provided nevertheless.
+
+**** Registering early regions
+
+     An early region is a region that is managed by CMA early during
+     the boot process.  It is the platform's responsibility to reserve
+     memory for early regions.  Later on, when CMA initialises, early
+     regions with reserved memory are registered as normal regions.
+     Registering an early region may be a way for a device to request
+     a private pool of memory without worrying about actually
+     reserving the memory:
+
+         int cma_early_region_register(struct cma_region *reg);
+
+     This needs to be done quite early on in boot process, before
+     platform traverses the cma_early_regions list to reserve memory.
+
+     When boot process ends, device driver may see whether the region
+     was reserved (by checking reg->reserved flag) and if so, whether
+     it was successfully registered as a normal region (by checking
+     the reg->registered flag).  If that is the case, device driver
+     can use normal API calls to use the region.
+
+*** Allocator operations
+
+    Creating an allocator for CMA needs four functions to be
+    implemented.
+
+
+    The first two are used to initialise an allocator for a given
+    region and clean up afterwards:
+
+        int  cma_foo_init(struct cma_region *reg);
+        void cma_foo_cleanup(struct cma_region *reg);
+
+    The first is called when the allocator is attached to a region.
+    The cma_region structure holds the starting address of the region
+    as well as its size.  Any data that the allocator associates with
+    the region can be saved in the private_data field.
+
+    The second call cleans up and frees all resources the allocator
+    has allocated for the region.  The function can assume that all
+    chunks allocated from this region have been freed, thus the whole
+    region is free.
+
+
+    The two other calls are used for allocating and freeing chunks.
+    They are:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment);
+        void cma_foo_free(struct cma_chunk *chunk);
+
+    As the names imply, the first allocates a chunk and the other
+    frees a chunk of memory.  They also manage the cma_chunk object
+    representing the chunk in physical memory.
+
+    Either of those functions can assume that it is the only thread
+    accessing the region.  Therefore, the allocator does not need to
+    worry about concurrency.  Moreover, all arguments are guaranteed to
+    be valid (i.e. a page-aligned size and a power-of-two alignment no
+    lower than the page size).
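+
+    What "valid" arguments mean can be sketched in userspace C.  The
+    helpers below are hypothetical, the 4096-byte page size is an
+    assumption, and rounding the size up to whole pages is a guess at
+    how a caller might produce a page-aligned size rather than
+    a description of what the CMA core does.
+
```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */

/* Zero or a power of two is acceptable as a requested alignment. */
bool sketch_alignment_ok(unsigned long alignment)
{
	return (alignment & (alignment - 1)) == 0;
}

/* Raise an accepted alignment to at least one page. */
unsigned long sketch_fix_alignment(unsigned long alignment)
{
	return alignment < SKETCH_PAGE_SIZE ? SKETCH_PAGE_SIZE : alignment;
}

/* Round a size up to a whole number of pages. */
size_t sketch_page_align(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```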
+
+
+    When allocator is ready, all that is left is to register it by
+    calling cma_allocator_register() function:
+
+            int cma_allocator_register(struct cma_allocator *alloc);
+
+    The argument is a structure with pointers to the above functions
+    and allocator's name.  The whole call may look something like
+    this:
+
+        static struct cma_allocator alloc = {
+                .name    = "foo",
+                .init    = cma_foo_init,
+                .cleanup = cma_foo_cleanup,
+                .alloc   = cma_foo_alloc,
+                .free    = cma_foo_free,
+        };
+        return cma_allocator_register(&alloc);
+
+    The name ("foo") will be available to use with command line
+    argument.
+
+*** Integration with platform
+
+    There is one function that needs to be called from platform
+    initialisation code.  That is the cma_early_regions_reserve()
+    function:
+
+        void cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+    It traverses list of all of the regions given on command line and
+    reserves memory for them.  The only argument is a callback
+    function used to reserve the region.  Passing NULL as the argument
+    makes the function use cma_early_region_reserve() function which
+    uses bootmem and memblock for allocating.
+
+    Alternatively, platform code could traverse the cma_early_regions
+    list by itself but this should not be necessary.
+
+
+    The platform also has a way of providing default attributes for
+    CMA; the cma_set_defaults() function is used for that purpose:
+
+        int __init cma_set_defaults(char *regions, const char *map,
+                                    const char *asterisk);
+
+    It needs to be called after early params have been parsed but
+    prior to reserving regions.  It lets one specify the three
+    attributes used to configure CMA.  Any of the pointers may point to
+    a string in the __initdata section.  The regions argument must be
+    modifiable though.
+
+** Future work
+
+    In the future, implementation of mechanisms that would allow the
+    free space inside the regions to be used as page cache, filesystem
+    buffers or swap devices is planned.  With such mechanisms, the
+    memory would not be wasted when not used.
+
+    Because all allocations and freeing of chunks pass the CMA
+    framework it can follow what parts of the reserved memory are
+    freed and what parts are allocated.  Tracking the unused memory
+    would let CMA use it for other purposes such as page cache, I/O
+    buffers, swap, etc.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a698255..5c7af89 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -43,6 +43,7 @@ parameter is applicable:
 	AVR32	AVR32 architecture is enabled.
 	AX25	Appropriate AX.25 support is enabled.
 	BLACKFIN Blackfin architecture is enabled.
+	CMA	Contiguous Memory Allocator is enabled.
 	DRM	Direct Rendering Management support is enabled.
 	EDD	BIOS Enhanced Disk Drive Services (EDD) is enabled
 	EFI	EFI Partitioning (GPT) is enabled
@@ -476,6 +477,9 @@ and is between 256 and 4096 characters. It is defined in the file
 			Also note the kernel might malfunction if you disable
 			some critical bits.
 
+	cma=		[CMA] List of CMA regions.
+			See Documentation/contiguous-memory.txt for details.
+
 	cmo_free_hint=	[PPC] Format: { yes | no }
 			Specify whether pages are marked as being inactive
 			when they are freed.  This is used in CMO environments
diff --git a/include/linux/cma.h b/include/linux/cma.h
new file mode 100644
index 0000000..96a8652
--- /dev/null
+++ b/include/linux/cma.h
@@ -0,0 +1,445 @@
+#ifndef __LINUX_CMA_H
+#define __LINUX_CMA_H
+
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+/***************************** Kernel level API *****************************/
+
+#ifdef __KERNEL__
+
+#include <linux/kref.h>
+#include <linux/rbtree.h>
+#include <linux/list.h>
+
+
+struct device;
+struct cma_info;
+
+/*
+ * Don't call it directly, use cma_alloc(), cma_alloc_from() or
+ * cma_alloc_from_region().
+ */
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *kind,
+	    size_t size, dma_addr_t alignment);
+
+/* Don't call it directly, use cma_info() or cma_info_about(). */
+int
+__cma_info(struct cma_info *info, const struct device *dev, const char *kind);
+
+
+/**
+ * cma_alloc - allocates contiguous chunk of memory.
+ * @dev:	The device to perform allocation for.
+ * @kind:	A kind of memory to allocate.  A device may use several
+ * 		different kinds of memory which are configured
+ * 		separately.  Usually it's safe to pass NULL here.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc(const struct device *dev, const char *kind,
+	  size_t size, dma_addr_t alignment)
+{
+	return dev ? __cma_alloc(dev, kind, size, alignment) : -EINVAL;
+}
+
+
+/**
+ * struct cma_info - information about regions returned by cma_info().
+ * @lower_bound:	The smallest address that can be allocated for
+ * 			the given (dev, kind) pair.
+ * @upper_bound:	One byte past the highest address that can be
+ * 			allocated for the given (dev, kind) pair.
+ * @total_size:	Total size of regions mapped to (dev, kind) pair.
+ * @free_size:	Total free size in all of the regions mapped to (dev, kind)
+ * 		pair.  Because of possible race conditions, it is not
+ * 		guaranteed that the value will be correct -- it gives only
+ * 		an approximation.
+ * @count:	Number of regions mapped to (dev, kind) pair.
+ */
+struct cma_info {
+	dma_addr_t lower_bound, upper_bound;
+	size_t total_size, free_size;
+	unsigned count;
+};
+
+/**
+ * cma_info - queries information about regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @dev:	The device to query information for.
+ * @kind:	A kind of memory to query information for.
+ * 		If unsure, pass NULL here.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info(struct cma_info *info, const struct device *dev, const char *kind)
+{
+	return dev ? __cma_info(info, dev, kind) : -EINVAL;
+}
+
+
+/**
+ * cma_get - increases reference counter of a chunk.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns zero on success or -ENOENT if there is no chunk at given
+ * location.  In the latter case issues a warning and a stacktrace.
+ */
+int cma_get(dma_addr_t addr);
+
+/**
+ * cma_put - decreases reference counter of a chunk.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns one if the chunk has been freed, zero if it hasn't, and
+ * -ENOENT if there is no chunk at given location.  In the latter case
+ * issues a warning and a stacktrace.
+ *
+ * If this function returns zero, you still can not count on the area
+ * remaining in memory.  Only use the return value if you want to see
+ * if the area is now gone, not present.
+ */
+int cma_put(dma_addr_t addr);
+
+
+
+/****************************** Lower level API *****************************/
+
+/**
+ * cma_alloc_from - allocates contiguous chunk of memory from named regions.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ * 		byte or a semicolon.  "*" or NULL means to try all regions
+ * 		which are listed as asterisk regions.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)
+{
+	return __cma_alloc(NULL, regions, size, alignment);
+}
+
+/**
+ * cma_info_about - queries information about named regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ * 		byte or a semicolon.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+{
+	return __cma_info(info, NULL, regions);
+}
+
+
+
+struct cma_allocator;
+
+/**
+ * struct cma_region - a region reserved for CMA allocations.
+ * @name:	Unique name of the region.  Read only.
+ * @start:	Physical starting address of the region in bytes.  Always
+ * 		aligned at least to a full page.  Read only.
+ * @size:	Size of the region in bytes.  A multiple of the page size.
+ * 		Read only.
+ * @free_space:	Free space in the region.  Read only.
+ * @alignment:	Desired alignment of the region in bytes.  A power of two,
+ * 		always at least page size.  Early.
+ * @alloc:	Allocator used with this region.  NULL means allocator is
+ * 		not attached.  Private.
+ * @alloc_name:	Allocator name read from cmdline.  Private.  This may be
+ * 		different from @alloc->name.
+ * @private_data:	Allocator's private data.
+ * @used:	Whether region was already used, i.e. there was at least
+ * 		one allocation request for it.  Private.
+ * @list:	Entry in list of regions.  Private.
+ * @registered:	Whether this region has been registered.  Read only.
+ * @reserved:	Whether this region has been reserved.  Early.  Read only.
+ * @copy_name:	Whether @name and @alloc_name need to be copied when
+ * 		this region is converted from early to normal.  Early.
+ * 		Private.
+ *
+ * Regions come in two types: an early region and normal region.  The
+ * former can be reserved or not-reserved.  Fields marked as "early"
+ * are only meaningful in early regions.
+ *
+ * Early regions are important only during initialisation.  The list
+ * of early regions is built from the "cma" command line argument or
+ * platform defaults.  Platform initialisation code is responsible for
+ * reserving space for unreserved regions that are placed on
+ * cma_early_regions list.
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+ /* * @users:	Number of chunks allocated in this region. */
+struct cma_region {
+	const char *name;
+	dma_addr_t start;
+	size_t size;
+	union {
+		size_t free_space;	/* Normal region */
+		dma_addr_t alignment;	/* Early region */
+	};
+
+	struct cma_allocator *alloc;
+	const char *alloc_name;
+	union {
+		void *private_data;	/* Normal region w/ allocator */
+		unsigned used;		/* Normal region w/o allocator */
+	};
+
+	/* unsigned users; */
+	struct list_head list;
+
+	unsigned registered:1;
+	unsigned reserved:1;
+	unsigned copy_name:1;
+};
+
+
+/**
+ * cma_region_register() - registers a region.
+ * @reg:	Region to add.
+ *
+ * Region's start and size must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region and accessible only using the
+ * cma_alloc_from_region() function.
+ *
+ * If alloc is set function will try to initialise given allocator
+ * (and will return error if it fails).  Otherwise alloc_name may
+ * point to a name of an allocator to use (if not set, the default
+ * will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  In particular, -EADDRINUSE if
+ * the region overlaps with an already existing region.
+ */
+int __must_check cma_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_alloc_from_region() - allocates contiguous chunk of memory from region.
+ * @reg:	Region to allocate chunk from.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);
+
+
+
+/****************************** Allocators API ******************************/
+
+/**
+ * struct cma_chunk - an allocated contiguous chunk of memory.
+ * @start:	Physical address in bytes.
+ * @size:	Size in bytes.
+ * @reg:	Region this chunk belongs to.
+ * @ref:	Number of references.  Private.
+ * @by_start:	A node in a red-black tree with all chunks sorted by
+ * 		start address.
+ *
+ * The cma_allocator::alloc() operation needs to set only the @start
+ * and @size fields.  The rest is handled by the caller (i.e. CMA
+ * glue).
+ */
+struct cma_chunk {
+	dma_addr_t start;
+	size_t size;
+
+	struct cma_region *reg;
+	struct kref ref;
+	struct rb_node by_start;
+};
+
+
+/**
+ * struct cma_allocator - a CMA allocator.
+ * @name:	Allocator's unique name
+ * @init:	Initialises an allocator on given region.
+ * @cleanup:	Cleans up after init.  May assume that there are no chunks
+ * 		allocated in given region.
+ * @alloc:	Allocates a chunk of memory of given size in bytes and
+ * 		with given alignment.  Alignment is a power of
+ * 		two (thus non-zero) and callback does not need to check it.
+ * 		May also assume that it is the only call that uses given
+ * 		region (ie. access to the region is synchronised with
+ * 		a mutex).  This has to allocate the chunk object (it may be
+ * 		contained in a bigger structure with allocator-specific data).
+ * 		Required.
+ * @free:	Frees allocated chunk.  May also assume that it is the only
+ * 		call that uses given region.  This has to free() the chunk
+ * 		object as well.  Required.
+ * @list:	Entry in list of allocators.  Private.
+ */
+ /* * @users:	How many regions use this allocator.  Private. */
+struct cma_allocator {
+	const char *name;
+
+	int (*init)(struct cma_region *reg);
+	void (*cleanup)(struct cma_region *reg);
+	struct cma_chunk *(*alloc)(struct cma_region *reg, size_t size,
+				   dma_addr_t alignment);
+	void (*free)(struct cma_chunk *chunk);
+
+	/* unsigned users; */
+	struct list_head list;
+};
+
+
+/**
+ * cma_allocator_register() - Registers an allocator.
+ * @alloc:	Allocator to register.
+ *
+ * Adds allocator to the list of allocators managed by CMA.
+ *
+ * All of the fields of cma_allocator structure must be set except for
+ * the optional name and for users and list, which will be overridden.
+ *
+ * Returns zero or negative error code.
+ */
+int cma_allocator_register(struct cma_allocator *alloc);
+
+
+/**************************** Initialisation API ****************************/
+
+/**
+ * cma_set_defaults() - specifies default command line parameters.
+ * @regions:	Default regions attribute.  May be overridden by "cma"
+ * 		command line argument.
+ * @map:	Default map attribute.  If not set all devices will use
+ * 		regions specified by @asterisk attribute.
+ * @asterisk:	Default asterisk attribute.  If not set will be built from
+ * 		@regions.  Specifies list of regions to use for devices
+ * 		that are missing a rule in @map.
+ *
+ * This function should be called prior to cma_early_regions_reserve()
+ * and after early parameters have been parsed.  The arguments
+ * specify defaults for various CMA attributes.
+ *
+ * All the pointers can point to strings in __initdata.  @regions
+ * needs to be modifiable.
+ *
+ * Returns negative error code if there was an error parsing the
+ * @regions parameter.  Zero otherwise.
+ */
+int __init cma_set_defaults(char *regions, const char *map,
+                            const char *asterisk);
+
+
+/**
+ * cma_early_regions - a list of early regions.
+ *
+ * Platform needs to allocate space for each of the regions before
+ * initcalls are executed.  If space is reserved, the reserved flag
+ * must be set.  Platform initialisation code may choose to use
+ * cma_early_regions_reserve().
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+extern struct list_head cma_early_regions __initdata;
+
+
+/**
+ * cma_early_region_register() - registers an early region.
+ * @reg:	Region to add.
+ *
+ * Region's start, size and alignment must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region accessible only using the cma_alloc_from_region().
+ *
+ * If alloc is set function will try to initialise given allocator
+ * when the early region is "converted" to normal region and
+ * registered during CMA initialisation.  If this fails, the space
+ * will still be reserved but the region won't be registered.
+ *
+ * As usual, alloc_name may point to a name of an allocator to use
+ * (if neither alloc nor alloc_name is set, the default will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  No checking if regions overlap is
+ * performed.
+ */
+int __init __must_check cma_early_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_early_region_reserve() - reserves a physically contiguous memory region.
+ * @reg:	Early region to reserve memory for.
+ *
+ * If platform supports bootmem this is the first allocator this
+ * function tries to use.  If that fails (or bootmem is not
+ * supported) function tries to use memblock if it is available.
+ *
+ * On success sets reg->reserved flag.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_early_region_reserve(struct cma_region *reg);
+
+/**
+ * cma_early_regions_reserve() - helper function for reserving early regions.
+ * @reserve:	Callback function used to reserve space for region.  Needs
+ * 		to return non-negative if allocation succeeded, negative
+ * 		error otherwise.  NULL means cma_early_region_reserve() will
+ * 		be used.
+ *
+ * This function traverses the %cma_early_regions list and tries to
+ * reserve memory for each early region.  It uses the @reserve
+ * callback function for that purpose.  The reserved flag of each
+ * region is updated accordingly.
+ */
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+#else
+
+#define cma_set_defaults(regions, map, asterisk) ((int)0)
+#define cma_early_regions_reserve(reserve)   do { } while (0)
+
+#endif
+
+#endif
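
For readers who want to experiment with the "regions" string accepted by
cma_set_defaults() and the "cma" parameter, here is a minimal user-space
sketch of the grammar documented in mm/cma.c
(['-'] name '=' size ['@' start] ['/' alignment] [':' alloc-name]).
It is not the patch's parser: struct region_spec, parse_size() and
parse_region() are hypothetical names, and parse_size() only approximates
the kernel's memparse() for the K/M/G suffixes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space mirror of one parsed region. */
struct region_spec {
	char name[32];
	char alloc[32];
	unsigned long long size, start, alignment;
	int skip_asterisk;	/* leading '-': name is left out of the "*" list */
};

/* Number with an optional K/M/G suffix; an approximation of memparse(). */
static unsigned long long parse_size(const char *s, const char **end)
{
	char *e;
	unsigned long long v = strtoull(s, &e, 0);

	switch (*e) {
	case 'G': case 'g': v <<= 10; /* fall through */
	case 'M': case 'm': v <<= 10; /* fall through */
	case 'K': case 'k': v <<= 10; ++e;
	}
	*end = e;
	return v;
}

/*
 * Parse one region.  Returns a pointer just past it (after the ';' if
 * there is one) or NULL on a syntax error.
 */
static const char *parse_region(const char *p, struct region_spec *r)
{
	const char *eq;
	size_t n;

	memset(r, 0, sizeof *r);
	if (*p == '-') {
		r->skip_asterisk = 1;
		++p;
	}

	eq = strchr(p, '=');
	n = eq ? (size_t)(eq - p) : 0;
	if (!n || n >= sizeof r->name)
		return NULL;
	memcpy(r->name, p, n);

	r->size = parse_size(eq + 1, &p);
	if (!r->size)
		return NULL;
	if (*p == '@')
		r->start = parse_size(p + 1, &p);
	if (*p == '/') {
		r->alignment = parse_size(p + 1, &p);
		/* Alignment must be a non-zero power of two. */
		if (!r->alignment || (r->alignment & (r->alignment - 1)))
			return NULL;
	}
	if (*p == ':') {
		n = strcspn(++p, ";");
		if (n >= sizeof r->alloc)
			return NULL;
		memcpy(r->alloc, p, n);
		p += n;
	}
	if (*p == ';')
		return p + 1;
	return *p ? NULL : p;
}
```

Calling parse_region() in a loop until it returns a pointer to the
terminating NUL walks a whole string such as
"reg1=64M:bf;reg2=32M@0x100000:bf;reg3=64M/1M:bf".
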
diff --git a/mm/Kconfig b/mm/Kconfig
index f4e516e..468b874 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -301,3 +301,37 @@ config NOMMU_INITIAL_TRIM_EXCESS
 	  of 1 says that all excess pages should be trimmed.
 
 	  See Documentation/nommu-mmap.txt for more information.
+
+
+config CMA
+	bool "Contiguous Memory Allocator framework"
+	# Currently there is only one allocator so force it on
+	select CMA_BEST_FIT
+	help
+	  This enables the Contiguous Memory Allocator framework which
+	  allows drivers to allocate big physically-contiguous blocks of
+	  memory for use with hardware components that support neither I/O
+	  map nor scatter-gather.
+
+	  If you select this option you will also have to select at least
+	  one allocator algorithm below.
+
+	  To make use of CMA you need to specify the regions and
+	  driver->region mapping on command line when booting the kernel.
+
+config CMA_DEBUG
+	bool "CMA debug messages"
+	depends on CMA
+	help
+	  Enable debug messages in CMA code.
+
+config CMA_BEST_FIT
+	bool "CMA best-fit allocator"
+	depends on CMA
+	default y
+	help
+	  This is a best-fit algorithm running in O(n log n) time where
+	  n is the number of existing holes (which is never greater than
+	  the number of allocated regions and usually much smaller).  It
+	  allocates area from the smallest hole that is big enough for
+	  allocation in question.
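
The best-fit rule described in the help text above can be illustrated in
user space.  The sketch below is an assumption-laden simplification: it
scans a plain array linearly, whereas the allocator in this patch keeps
holes in red-black trees, and struct hole, align_up() and best_fit() are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hole; not the patch's struct cma_bf_item. */
struct hole {
	uint64_t start;
	uint64_t size;
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Return the index of the smallest hole that can hold size bytes at an
 * alignment-aligned address, or -1 if no hole fits.
 */
static int best_fit(const struct hole *holes, size_t n,
		    uint64_t size, uint64_t alignment)
{
	int best = -1;
	size_t i;

	assert(alignment && (alignment & (alignment - 1)) == 0);

	for (i = 0; i < n; ++i) {
		uint64_t start = align_up(holes[i].start, alignment);
		uint64_t end = holes[i].start + holes[i].size;

		if (start >= end || end - start < size)
			continue;	/* too small once aligned */
		if (best < 0 || holes[i].size < holes[best].size)
			best = (int)i;
	}
	return best;
}
```

Note that a hole which is large enough by size alone can still be rejected
once its start address is rounded up, which is why the size check is done
on the aligned range.
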
diff --git a/mm/Makefile b/mm/Makefile
index 34b2546..54b0e99 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -47,3 +47,6 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+
+obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
diff --git a/mm/cma-best-fit.c b/mm/cma-best-fit.c
new file mode 100644
index 0000000..59515f9
--- /dev/null
+++ b/mm/cma-best-fit.c
@@ -0,0 +1,407 @@
+/*
+ * Contiguous Memory Allocator framework: Best Fit allocator
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: bf: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/slab.h>        /* kmalloc() */
+
+#include <linux/cma.h>         /* CMA structures */
+
+
+/************************* Data Types *************************/
+
+struct cma_bf_item {
+	struct cma_chunk ch;
+	struct rb_node by_size;
+};
+
+struct cma_bf_private {
+	struct rb_root by_start_root;
+	struct rb_root by_size_root;
+};
+
+
+/************************* Prototypes *************************/
+
+/*
+ * Those are only for holes.  They must be called whenever hole's
+ * properties change but also whenever chunk becomes a hole or hole
+ * becomes a chunk.
+ */
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item);
+static int  __must_check
+            __cma_bf_hole_insert_by_start(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item);
+
+/**
+ * __cma_bf_hole_take - takes a chunk of memory out of a hole.
+ * @hole:	hole to take chunk from
+ * @size:	chunk's size
+ * @alignment:	chunk's starting address alignment (must be power of two)
+ *
+ * Takes a @size bytes large chunk from hole @hole which must be able
+ * to hold the chunk.  The "must be able" includes also alignment
+ * constraint.
+ *
+ * Returns allocated item or NULL on error (if kmalloc() failed).
+ */
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment);
+
+/**
+ * __cma_bf_hole_merge_maybe - tries to merge hole with neighbours.
+ * @item: hole to try and merge
+ *
+ * Which items are preserved is undefined so you may not rely on it.
+ */
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item);
+
+
+/************************* Device API *************************/
+
+int cma_bf_init(struct cma_region *reg)
+{
+	struct cma_bf_private *prv;
+	struct cma_bf_item *item;
+
+	prv = kzalloc(sizeof *prv, GFP_KERNEL);
+	if (unlikely(!prv))
+		return -ENOMEM;
+
+	item = kzalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item)) {
+		kfree(prv);
+		return -ENOMEM;
+	}
+
+	item->ch.start = reg->start;
+	item->ch.size  = reg->size;
+	item->ch.reg   = reg;
+
+	rb_root_init(&prv->by_start_root, &item->ch.by_start);
+	rb_root_init(&prv->by_size_root, &item->by_size);
+
+	reg->private_data = prv;
+	return 0;
+}
+
+void cma_bf_cleanup(struct cma_region *reg)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct cma_bf_item *item =
+		rb_entry(prv->by_size_root.rb_node,
+			 struct cma_bf_item, by_size);
+
+	/* We can assume there is only a single hole in the tree. */
+	WARN_ON(item->by_size.rb_left || item->by_size.rb_right ||
+		item->ch.by_start.rb_left || item->ch.by_start.rb_right);
+
+	kfree(item);
+	kfree(prv);
+}
+
+struct cma_chunk *cma_bf_alloc(struct cma_region *reg,
+			       size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct rb_node *node = prv->by_size_root.rb_node;
+	struct cma_bf_item *item = NULL;
+
+	/* First find hole that is large enough */
+	while (node) {
+		struct cma_bf_item *i =
+			rb_entry(node, struct cma_bf_item, by_size);
+
+		if (i->ch.size < size) {
+			node = node->rb_right;
+		} else if (i->ch.size >= size) {
+			node = node->rb_left;
+			item = i;
+		}
+	}
+	if (!item)
+		return NULL;
+
+	/* Now look for items which can satisfy alignment requirements */
+	for (;;) {
+		dma_addr_t start = ALIGN(item->ch.start, alignment);
+		dma_addr_t end   = item->ch.start + item->ch.size;
+		if (start < end && end - start >= size) {
+			item = __cma_bf_hole_take(item, size, alignment);
+			return likely(item) ? &item->ch : NULL;
+		}
+
+		node = rb_next(&item->by_size);
+		if (!node)
+			return NULL;
+
+		item  = rb_entry(node, struct cma_bf_item, by_size);
+	}
+}
+
+void cma_bf_free(struct cma_chunk *chunk)
+{
+	struct cma_bf_item *item = container_of(chunk, struct cma_bf_item, ch);
+
+	/* Add new hole */
+	if (unlikely(__cma_bf_hole_insert_by_start(item))) {
+		/*
+		 * We're screwed...  Just free the item and forget
+		 * about it.  Things are broken beyond repair so no
+		 * sense in trying to recover.
+		 */
+		kfree(item);
+	} else {
+		__cma_bf_hole_insert_by_size(item);
+
+		/* Merge with prev and next sibling */
+		__cma_bf_hole_merge_maybe(item);
+	}
+}
+
+
+/************************* Basic Tree Manipulation *************************/
+
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_size_root.rb_node, *parent = NULL;
+	const typeof(item->ch.size) value = item->ch.size;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, by_size);
+		link = value <= i->ch.size
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->by_size, parent, link);
+	rb_insert_color(&item->by_size, &prv->by_size_root);
+}
+
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->by_size, &prv->by_size_root);
+}
+
+static int  __must_check
+            __cma_bf_hole_insert_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_start_root.rb_node, *parent = NULL;
+	const typeof(item->ch.start) value = item->ch.start;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, ch.by_start);
+
+		if (WARN_ON(value == i->ch.start))
+			/*
+			 * This should *never* happen.  And I mean
+			 * *never*.  We could even BUG on it but
+			 * hopefully things are only a bit broken,
+			 * ie. system can still run.  We produce
+			 * a warning and return an error.
+			 */
+			return -EBUSY;
+
+		link = value <= i->ch.start
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->ch.by_start, parent, link);
+	rb_insert_color(&item->ch.by_start, &prv->by_start_root);
+	return 0;
+}
+
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->ch.by_start, &prv->by_start_root);
+}
+
+
+/************************* More Tree Manipulation *************************/
+
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_item *item;
+
+	/*
+	 * There are three cases:
+	 * 1. the chunk takes the whole hole,
+	 * 2. the chunk is at the beginning or at the end of the hole, or
+	 * 3. the chunk is in the middle of the hole.
+	 */
+
+
+	/* Case 1, the whole hole */
+	if (size == hole->ch.size) {
+		__cma_bf_hole_erase_by_size(hole);
+		__cma_bf_hole_erase_by_start(hole);
+		return hole;
+	}
+
+
+	/* Allocate */
+	item = kmalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item))
+		return NULL;
+
+	item->ch.start = ALIGN(hole->ch.start, alignment);
+	item->ch.size  = size;
+
+	/* Case 3, in the middle */
+	if (item->ch.start != hole->ch.start
+	 && item->ch.start + item->ch.size !=
+	    hole->ch.start + hole->ch.size) {
+		struct cma_bf_item *tail;
+
+		/*
+		 * Space between the end of the chunk and the end of
+		 * the region, ie. space left after the end of the
+		 * chunk.  If this is divisible by alignment we can
+		 * move the chunk to the end of the hole.
+		 */
+		size_t left =
+			hole->ch.start + hole->ch.size -
+			(item->ch.start + item->ch.size);
+		if (left % alignment == 0) {
+			item->ch.start += left;
+			goto case_2;
+		}
+
+		/*
+		 * We are going to add a hole at the end.  This way,
+		 * we will reduce the problem to case 2 -- the chunk
+		 * will be at the end of the hole.
+		 */
+		tail = kmalloc(sizeof *tail, GFP_KERNEL);
+		if (unlikely(!tail)) {
+			kfree(item);
+			return NULL;
+		}
+
+		tail->ch.start = item->ch.start + item->ch.size;
+		tail->ch.size  =
+			hole->ch.start + hole->ch.size - tail->ch.start;
+		tail->ch.reg   = hole->ch.reg;
+
+		if (unlikely(__cma_bf_hole_insert_by_start(tail))) {
+			/*
+			 * Things are broken beyond repair...  Abort
+			 * inserting the hole but still continue with
+			 * allocation (seems like the best we can do).
+			 */
+
+			hole->ch.size = tail->ch.start - hole->ch.start;
+			kfree(tail);
+		} else {
+			__cma_bf_hole_insert_by_size(tail);
+			/*
+			 * It's important that we first insert the new
+			 * hole in the tree sorted by size and later
+			 * reduce the size of the old hole.  We will
+			 * update the position of the old hole in the
+			 * rb tree in code that handles case 2.
+			 */
+			hole->ch.size = tail->ch.start - hole->ch.start;
+		}
+
+		/* Go to case 2 */
+	}
+
+
+	/* Case 2, at the beginning or at the end */
+case_2:
+	/* No need to update the tree; order preserved. */
+	if (item->ch.start == hole->ch.start)
+		hole->ch.start += item->ch.size;
+
+	/* Alter hole's size */
+	hole->ch.size -= size;
+	__cma_bf_hole_erase_by_size(hole);
+	__cma_bf_hole_insert_by_size(hole);
+
+	return item;
+}
+
+
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item)
+{
+	struct cma_bf_item *prev;
+	struct rb_node *node;
+	int twice = 2;
+
+	node = rb_prev(&item->ch.by_start);
+	if (unlikely(!node))
+		goto next;
+	prev = rb_entry(node, struct cma_bf_item, ch.by_start);
+
+	for (;;) {
+		if (prev->ch.start + prev->ch.size == item->ch.start) {
+			/* Remove previous hole from trees */
+			__cma_bf_hole_erase_by_size(prev);
+			__cma_bf_hole_erase_by_start(prev);
+
+			/* Alter this hole */
+			item->ch.size += prev->ch.size;
+			item->ch.start = prev->ch.start;
+			__cma_bf_hole_erase_by_size(item);
+			__cma_bf_hole_insert_by_size(item);
+			/*
+			 * No need to update by start trees as we do
+			 * not break sequence order
+			 */
+
+			/* Free prev hole */
+			kfree(prev);
+		}
+
+next:
+		if (!--twice)
+			break;
+
+		node = rb_next(&item->ch.by_start);
+		if (unlikely(!node))
+			break;
+		prev = item;
+		item = rb_entry(node, struct cma_bf_item, ch.by_start);
+	}
+}
+
+
+
+/************************* Register *************************/
+static int cma_bf_module_init(void)
+{
+	static struct cma_allocator alloc = {
+		.name    = "bf",
+		.init    = cma_bf_init,
+		.cleanup = cma_bf_cleanup,
+		.alloc   = cma_bf_alloc,
+		.free    = cma_bf_free,
+	};
+	return cma_allocator_register(&alloc);
+}
+module_init(cma_bf_module_init);
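
The three cases handled by __cma_bf_hole_take() (whole hole, chunk at one
end, chunk in the middle) can be tried out in user space with the sketch
below.  It is only an illustration under assumptions: struct span and
take_from_hole() are hypothetical, no trees or allocations are involved,
and the leftovers are simply returned to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical address range; not a structure from the patch. */
struct span {
	uint64_t start;
	uint64_t size;
};

/*
 * Carve size bytes at an alignment-aligned address out of hole.
 * The allocated range is stored in *chunk; what remains before and
 * after it is stored in *head and *tail (size 0 when nothing remains).
 * Returns the number of leftover holes (0, 1 or 2), or -1 if the
 * chunk does not fit.
 */
static int take_from_hole(struct span hole, uint64_t size,
			  uint64_t alignment, struct span *chunk,
			  struct span *head, struct span *tail)
{
	uint64_t end = hole.start + hole.size;
	uint64_t start;

	assert(alignment && (alignment & (alignment - 1)) == 0);

	start = (hole.start + alignment - 1) & ~(alignment - 1);
	if (start >= end || end - start < size)
		return -1;

	/*
	 * Same trick as in the patch: if the chunk would land in the
	 * middle and the space after it is a multiple of the alignment,
	 * slide the chunk to the end so only one hole is left.
	 */
	if (start != hole.start && (end - (start + size)) % alignment == 0)
		start = end - size;

	chunk->start = start;
	chunk->size  = size;
	head->start  = hole.start;
	head->size   = start - hole.start;
	tail->start  = start + size;
	tail->size   = end - tail->start;

	return (head->size != 0) + (tail->size != 0);
}
```

A return value of 2 corresponds to the case in which the patch has to
allocate an extra item for the tail hole.
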
diff --git a/mm/cma.c b/mm/cma.c
new file mode 100644
index 0000000..0b02ab0
--- /dev/null
+++ b/mm/cma.c
@@ -0,0 +1,1170 @@
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#ifndef CONFIG_NO_BOOTMEM
+#  include <linux/bootmem.h>   /* alloc_bootmem_pages_nopanic() */
+#endif
+#ifdef CONFIG_HAVE_MEMBLOCK
+#  include <linux/memblock.h>  /* memblock*() */
+#endif
+#include <linux/device.h>      /* struct device, dev_name() */
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR, PTR_ERR, etc. */
+#include <linux/mm.h>          /* PAGE_ALIGN() */
+#include <linux/module.h>      /* EXPORT_SYMBOL_GPL() */
+#include <linux/mutex.h>       /* mutex */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/string.h>      /* str*() */
+
+#include <linux/cma.h>
+
+
+/*
+ * Protects cma_regions, cma_allocators, cma_attr_map (which means
+ * cma_map and cma_map_len), cma_attr_asterisk (which means
+ * cma_asterisk and cma_asterisk_len) and cma_chunks_by_start.
+ */
+static DEFINE_MUTEX(cma_mutex);
+
+static struct kobject *cma_kobj;
+
+
+/************************* Attributes *************************/
+
+struct cma_attribute {
+	struct kobj_attribute kattr;
+	const char *value;
+	size_t length;
+	ssize_t (*validate)(const char *value);
+};
+
+#define __CMA_ATTR(_name) \
+	__ATTR(_name, 0644, cma_attr_show, cma_attr_store)
+
+static ssize_t cma_attr_show(struct kobject *kobj,
+			     struct kobj_attribute *_attr, char *page)
+{
+	struct cma_attribute *attr =
+		container_of(_attr, struct cma_attribute, kattr);
+	ssize_t len;
+
+	mutex_lock(&cma_mutex);
+
+	if (!attr->value || !attr->length) {
+		*page = 0;
+		len = 0;
+	} else {
+		len = min(attr->length, (size_t)PAGE_SIZE - 2);
+		memcpy(page, attr->value, len);
+		page[len]   = '\n';
+		page[++len] = '\0';
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	return len;
+}
+
+static ssize_t cma_attr_store(struct kobject *kobj,
+			      struct kobj_attribute *_attr,
+			      const char *page, size_t count)
+{
+	struct cma_attribute *attr =
+		container_of(_attr, struct cma_attribute, kattr);
+	ssize_t len = attr->validate(page);
+	char *val = NULL;
+
+	if (len < 0)
+		return len;
+
+	if (len) {
+		val = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!val)
+			return -ENOMEM;
+		val[len] = '\0';
+	}
+
+	mutex_lock(&cma_mutex);
+
+	kfree(attr->value);
+	attr->value = val;
+	attr->length = len;
+
+	mutex_unlock(&cma_mutex);
+
+	return count;
+}
+
+
+/*
+ * param        ::= [ rules [ ';' ] ]
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' regions
+ * patterns     ::= pattern [ ',' patterns ]
+ *
+ * regions      ::= reg-name [ ',' regions ] | '*'
+ *              // list of regions to try to allocate memory
+ *              // from for devices that match pattern
+ *
+ * pattern      ::= dev-pattern [ '/' kind-pattern ]
+ *                | '/' kind-pattern
+ *              // pattern request must match for this rule to
+ *              // apply to it; the first rule that matches is
+ *              // applied; if dev-pattern part is omitted
+ *              // value identical to the one used in previous
+ *              // rule is assumed
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example (white space added for convenience, forbidden in real string):
+ * cma_map = foo-dev = reg1;             -- foo-dev with no kind
+ *           bar-dev / firmware = reg3;  -- bar-dev's firmware
+ *           / * = reg2;                 -- bar-dev's all other kinds
+ *           baz-dev / * = reg1,reg2;    -- any kind of baz-dev
+ *           * / * = reg2,reg1;          -- any other allocations
+ */
+static ssize_t cma_map_validate(const char *param)
+{
+	const char *ch = param;
+
+	if (*ch == '\0' || *ch == '\n')
+		return 0;
+
+	for (;;) {
+		const char *start = ch;
+
+		while (*ch && *ch != '\n' && *ch != ';' && *ch != '=')
+			++ch;
+
+		if (*ch != '=' || start == ch) {
+			pr_err("map: expecting \"<patterns>=<regions>\" near %s\n", start);
+			return -EINVAL;
+		}
+
+		if (ch[1] == '*' && (ch[2] && ch[2] != '\n' && ch[2] != ';')) {
+			pr_err("map: expecting ';' or end after '*' near %s\n", start);
+			return -EINVAL;
+		}
+
+		while (*++ch != ';')
+			if (!*ch || *ch == '\n')
+				return ch - param;
+		if (ch[1] == '\0' || ch[1] == '\n')
+			return ch - param;
+		++ch;
+	}
+}
+
+
+static struct cma_attribute cma_attributes[] = {
+	{
+		.kattr		= __CMA_ATTR(map),
+		.validate	= &cma_map_validate
+	},
+	{
+		.kattr		= __CMA_ATTR(asterisk),
+		.validate	= (ssize_t (*)(const char *))&strlen,
+	},
+};
+
+#define cma_map       (cma_attributes[0].value)
+#define cma_asterisk  (cma_attributes[1].value)
+
+
+static int __init cma_attributes_init(void)
+{
+	struct cma_attribute *attr = cma_attributes;
+	int i = ARRAY_SIZE(cma_attributes), ret = 0;
+
+	do {
+		if (attr->value) {
+			/* Apply defaults.  Ignore parsing errors. */
+			const char *v = attr->value;
+			attr->value = NULL;
+			cma_attr_store(NULL, &attr->kattr, v, 0);
+		}
+
+		/*
+		 * Even if there was an error creating one file
+		 * initialise the rest.
+		 */
+		if (cma_kobj && !ret)
+			ret = sysfs_create_file(cma_kobj, &attr->kattr.attr);
+	} while (++attr, --i);
+
+	return ret;
+}
+
+
+
+/************************* Early parameter *************************/
+
+struct list_head cma_early_regions __initdata =
+	LIST_HEAD_INIT(cma_early_regions);
+
+/*
+ * regions-attr ::= [ regions [ ';' ] ]
+ * regions      ::= region [ ';' regions ]
+ *
+ * region       ::= [ '-' ] reg-name
+ *                    '=' size
+ *                  [ '@' start ]
+ *                  [ '/' alignment ]
+ *                  [ ':' alloc-name ]
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example:
+ * cma=reg1=64M:bf;reg2=32M@0x100000:bf;reg3=64M/1M:bf
+ *
+ * If allocator is omitted the first available allocator will be used.
+ */
+
+#define NUMPARSE(cond_ch, type, cond) ({				\
+		unsigned long long v = 0;				\
+		if (*param == (cond_ch)) {				\
+			const char *const msg = param + 1;		\
+			v = memparse(msg, &param);			\
+			if (!v || v > ~(type)0 || !(cond)) {		\
+				pr_err("param: invalid value near %s\n", msg); \
+				ret = -EINVAL;				\
+				break;					\
+			}						\
+		}							\
+		v;							\
+	})
+
+static int __init cma_param_parse(char *param)
+{
+	static int already_parsed __initdata;
+
+	static struct cma_region regions[16];
+	static char buffer[128] __initdata;
+
+	size_t left = ARRAY_SIZE(regions);
+	struct cma_region *reg = regions;
+
+	size_t buf_left = ARRAY_SIZE(buffer);
+	char *buf = buffer;
+
+	int ret = 0;
+
+	pr_debug("param: %s\n", param);
+
+	if (already_parsed)
+		return 0;
+	already_parsed = 1;
+
+	for (; *param; ++reg) {
+		dma_addr_t start, alignment;
+		size_t size;
+		int minus;
+
+		if (unlikely(!--left)) {
+			pr_err("param: too many early regions\n");
+			return -ENOSPC;
+		}
+
+		/* Parse name */
+		minus = *param == '-';
+		if (minus)
+			++param;
+
+		reg->name = param;
+		param = strchr(param, '=');
+		if (!param || param == reg->name) {
+			pr_err("param: expected \"<name>=\" near %s\n",
+			       reg->name);
+			ret = -EINVAL;
+			break;
+		}
+		*param = '\0';
+
+		/* Copy name to buffer */
+		if (!minus) {
+			size_t len = param - reg->name;
+			if (len >= buf_left) {
+				pr_err("param: names too long\n");
+				ret = -ENOSPC;
+				break;
+			}
+
+			memcpy(buf, reg->name, len);
+			buf += len;
+			*buf = ',';
+			++buf;
+			buf_left -= len + 1;
+		}
+
+		/* Parse numbers */
+		size      = NUMPARSE('\0', size_t, true);
+		start     = NUMPARSE('@', dma_addr_t, true);
+		alignment = NUMPARSE('/', dma_addr_t, (v & (v - 1)) == 0);
+
+		alignment = max(alignment, (dma_addr_t)PAGE_SIZE);
+		start     = ALIGN(start, alignment);
+		size      = PAGE_ALIGN(size);
+		if (start + size < start) {
+			pr_err("param: invalid start, size combination\n");
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Parse allocator */
+		if (*param == ':') {
+			reg->alloc_name = ++param;
+			while (*param && *param != ';')
+				++param;
+			if (param == reg->alloc_name)
+				reg->alloc_name = NULL;
+		}
+
+		/* Go to next */
+		if (*param == ';') {
+			*param = '\0';
+			++param;
+		} else if (*param) {
+			pr_err("param: expecting ';' or end of parameter near %s\n",
+			       param);
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Add */
+		reg->size      = size;
+		reg->start     = start;
+		reg->alignment = alignment;
+		reg->copy_name = 1;
+
+		list_add_tail(&reg->list, &cma_early_regions);
+
+		pr_debug("param: registering early region %s (%p@%p/%p)\n",
+			 reg->name, (void *)reg->size, (void *)reg->start,
+			 (void *)reg->alignment);
+	}
+
+	if (buf != buffer) {
+		buf[-1] = '\0';
+		cma_asterisk = buffer;
+	}
+	return ret;
+}
+early_param("cma", cma_param_parse);
+
+#undef NUMPARSE
+
+
+int __init __must_check cma_early_region_register(struct cma_region *reg)
+{
+	dma_addr_t start, alignment;
+	size_t size;
+
+	if (reg->alignment & (reg->alignment - 1))
+		return -EINVAL;
+
+	alignment = max(reg->alignment, (dma_addr_t)PAGE_SIZE);
+	start     = ALIGN(reg->start, alignment);
+	size      = PAGE_ALIGN(reg->size);
+
+	if (start + size < start)
+		return -EINVAL;
+
+	reg->size      = size;
+	reg->start     = start;
+	reg->alignment = alignment;
+
+	list_add_tail(&reg->list, &cma_early_regions);
+
+	pr_debug("param: registering early region %s (%p@%p/%p)\n",
+		 reg->name, (void *)reg->size, (void *)reg->start,
+		 (void *)reg->alignment);
+
+	return 0;
+}
+
+
+
+/************************* Regions & Allocators *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg);
+
+
+/* List of all regions.  Named regions are kept before unnamed. */
+static LIST_HEAD(cma_regions);
+
+#define cma_foreach_region(reg) \
+	list_for_each_entry(reg, &cma_regions, list)
+
+int __must_check cma_region_register(struct cma_region *reg)
+{
+	const char *name, *alloc_name;
+	struct cma_region *r;
+	char *ch = NULL;
+	int ret = 0;
+
+	if (!reg->size || reg->start + reg->size < reg->start)
+		return -EINVAL;
+
+	/* reg->users = 0; */
+	reg->used = 0;
+	reg->private_data = NULL;
+	reg->registered = 0;
+	reg->free_space = reg->size;
+
+	/* Copy name and alloc_name */
+	name = reg->name;
+	alloc_name = reg->alloc_name;
+	if (reg->copy_name && (reg->name || reg->alloc_name)) {
+		size_t name_size, alloc_size;
+
+		name_size  = reg->name       ? strlen(reg->name) + 1       : 0;
+		alloc_size = reg->alloc_name ? strlen(reg->alloc_name) + 1 : 0;
+
+		ch = kmalloc(name_size + alloc_size, GFP_KERNEL);
+		if (!ch) {
+			pr_err("%s: not enough memory to allocate name\n",
+			       reg->name ?: "(private)");
+			return -ENOMEM;
+		}
+
+		/* ch must keep pointing at the buffer for kfree(). */
+		if (name_size) {
+			memcpy(ch, reg->name, name_size);
+			name = ch;
+		}
+
+		if (alloc_size) {
+			memcpy(ch + name_size, reg->alloc_name, alloc_size);
+			alloc_name = ch + name_size;
+		}
+	}
+
+	mutex_lock(&cma_mutex);
+
+	/* Don't let regions overlap */
+	cma_foreach_region(r)
+		if (r->start + r->size > reg->start &&
+		    r->start < reg->start + reg->size) {
+			ret = -EADDRINUSE;
+			goto done;
+		}
+
+	if (reg->alloc) {
+		ret = __cma_region_attach_alloc(reg);
+		if (unlikely(ret < 0))
+			goto done;
+	}
+
+	reg->name = name;
+	reg->alloc_name = alloc_name;
+	reg->registered = 1;
+	ch = NULL;
+
+	/*
+	 * Keep named at the beginning and unnamed (private) at the
+	 * end.  This helps in traversal when named region is looked
+	 * for.
+	 */
+	if (name)
+		list_add(&reg->list, &cma_regions);
+	else
+		list_add_tail(&reg->list, &cma_regions);
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: region %sregistered\n",
+		 reg->name ?: "(private)", ret ? "not " : "");
+	if (ch)
+		kfree(ch);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_region_register);
+
+static struct cma_region *__must_check
+__cma_region_find(const char **namep)
+{
+	struct cma_region *reg;
+	const char *ch, *name;
+	size_t n;
+
+	for (ch = *namep; *ch && *ch != ',' && *ch != ';'; ++ch)
+		/* nop */;
+	name = *namep;
+	*namep = *ch ? ch : (ch + 1);
+	n = ch - name;
+
+	/*
+	 * Named regions are kept in front of unnamed so if we
+	 * encounter unnamed region we can stop.
+	 */
+	cma_foreach_region(reg)
+		if (!reg->name)
+			break;
+		else if (!strncmp(name, reg->name, n) && !reg->name[n])
+			return reg;
+
+	return NULL;
+}
+
+
+/* List of all allocators. */
+static LIST_HEAD(cma_allocators);
+
+#define cma_foreach_allocator(alloc) \
+	list_for_each_entry(alloc, &cma_allocators, list)
+
+int cma_allocator_register(struct cma_allocator *alloc)
+{
+	struct cma_region *reg;
+	int first;
+
+	if (!alloc->alloc || !alloc->free)
+		return -EINVAL;
+
+	/* alloc->users = 0; */
+
+	mutex_lock(&cma_mutex);
+
+	first = list_empty(&cma_allocators);
+
+	list_add_tail(&alloc->list, &cma_allocators);
+
+	/*
+	 * Attach this allocator to all allocator-less regions that
+	 * request this particular allocator (reg->alloc_name equals
+	 * alloc->name) or if region wants the first available
+	 * allocator and we are the first.
+	 */
+	cma_foreach_region(reg) {
+		if (reg->alloc)
+			continue;
+		if (!(reg->alloc_name
+		      ? alloc->name && !strcmp(alloc->name, reg->alloc_name)
+		      : (!reg->used && first)))
+			continue;
+
+		reg->alloc = alloc;
+		__cma_region_attach_alloc(reg);
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: allocator registered\n", alloc->name ?: "(unnamed)");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cma_allocator_register);
+
+static struct cma_allocator *__must_check
+__cma_allocator_find(const char *name)
+{
+	struct cma_allocator *alloc;
+
+	if (!name)
+		return list_empty(&cma_allocators)
+			? NULL
+			: list_entry(cma_allocators.next,
+				     struct cma_allocator, list);
+
+	cma_foreach_allocator(alloc)
+		if (alloc->name && !strcmp(name, alloc->name))
+			return alloc;
+
+	return NULL;
+}
+
+
+
+/************************* Initialise CMA *************************/
+
+int __init cma_set_defaults(char *regions, const char *map,
+                            const char *asterisk)
+{
+	if (map)
+		cma_map = map;
+
+	if (asterisk)
+		cma_asterisk = asterisk;
+
+	if (regions)
+		return cma_param_parse(regions);
+	else
+		return 0;
+}
+
+
+int __init cma_early_region_reserve(struct cma_region *reg)
+{
+	int tried = 0;
+
+	if (!reg->size || (reg->alignment & (reg->alignment - 1)) ||
+	    reg->reserved)
+		return -EINVAL;
+
+#ifndef CONFIG_NO_BOOTMEM
+
+	tried = 1;
+
+	{
+		void *ptr;
+
+		ptr = __alloc_bootmem_nopanic(reg->size, reg->alignment,
+					      reg->start);
+		if (ptr) {
+			reg->start = virt_to_phys(ptr);
+			reg->reserved = 1;
+			return 0;
+		}
+	}
+
+#endif
+
+#ifdef CONFIG_HAVE_MEMBLOCK
+
+	tried = 1;
+
+	if (reg->start) {
+		if (memblock_is_region_reserved(reg->start, reg->size) < 0 &&
+		    memblock_reserve(reg->start, reg->size) >= 0) {
+			reg->reserved = 1;
+			return 0;
+		}
+	} else {
+		/*
+		 * Use __memblock_alloc_base() since
+		 * memblock_alloc_base() panic()s.
+		 */
+		u64 ret = __memblock_alloc_base(reg->size, reg->alignment, 0);
+		if (ret &&
+		    ret < ~(dma_addr_t)0 &&
+		    ret + reg->size < ~(dma_addr_t)0 &&
+		    ret + reg->size > ret) {
+			reg->start = ret;
+			reg->reserved = 1;
+			return 0;
+		}
+
+		if (ret)
+			memblock_free(ret, reg->size);
+	}
+
+#endif
+
+	return tried ? -ENOMEM : -EOPNOTSUPP;
+}
+
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg))
+{
+	struct cma_region *reg;
+
+	pr_debug("init: reserving early regions\n");
+
+	if (!reserve)
+		reserve = cma_early_region_reserve;
+
+	list_for_each_entry(reg, &cma_early_regions, list) {
+		if (reg->reserved) {
+			/* nothing */
+		} else if (reserve(reg) >= 0) {
+			pr_debug("init: %s: reserved %p@%p\n",
+				 reg->name ?: "(private)",
+				 (void *)reg->size, (void *)reg->start);
+			reg->reserved = 1;
+		} else {
+			pr_warn("init: %s: unable to reserve %p@%p/%p\n",
+				reg->name ?: "(private)",
+				(void *)reg->size, (void *)reg->start,
+				(void *)reg->alignment);
+		}
+	}
+}
+
+
+static int __init cma_init(void)
+{
+	struct cma_region *reg, *n;
+
+	pr_debug("init: initialising\n");
+
+	cma_kobj = kobject_create_and_add("contiguous", mm_kobj);
+	if (unlikely(!cma_kobj))
+		pr_err("init: unable to add sysfs directory\n");
+
+	cma_attributes_init();
+
+	list_for_each_entry_safe(reg, n, &cma_early_regions, list) {
+		INIT_LIST_HEAD(&reg->list);
+		/*
+		 * We don't care if there was an error.  It's a pity
+		 * but there's not much we can do about it anyway.
+		 * If the error is on a region that was parsed from
+		 * command line then it will stay and waste a bit of
+		 * space; if it was registered using
+		 * cma_early_region_register() it's caller's
+		 * responsibility to do something about it.
+		 */
+		if (reg->reserved && cma_region_register(reg) < 0)
+			/* ignore error */;
+	}
+
+	INIT_LIST_HEAD(&cma_early_regions);
+
+	return 0;
+}
+__initcall(cma_init);
+
+
+
+/************************* Chunks *************************/
+
+/* All chunks sorted by start address. */
+static struct rb_root cma_chunks_by_start;
+
+static struct cma_chunk *__must_check __cma_chunk_find(dma_addr_t addr)
+{
+	struct cma_chunk *chunk;
+	struct rb_node *n;
+
+	for (n = cma_chunks_by_start.rb_node; n; ) {
+		chunk = rb_entry(n, struct cma_chunk, by_start);
+		if (addr < chunk->start)
+			n = n->rb_left;
+		else if (addr > chunk->start)
+			n = n->rb_right;
+		else
+			return chunk;
+	}
+	WARN(1, "no chunk starting at %p\n", (void *)addr);
+	return NULL;
+}
+
+static int __must_check __cma_chunk_insert(struct cma_chunk *chunk)
+{
+	struct rb_node **new, *parent = NULL;
+	typeof(chunk->start) addr = chunk->start;
+
+	for (new = &cma_chunks_by_start.rb_node; *new; ) {
+		struct cma_chunk *c =
+			container_of(*new, struct cma_chunk, by_start);
+
+		parent = *new;
+		if (addr < c->start) {
+			new = &(*new)->rb_left;
+		} else if (addr > c->start) {
+			new = &(*new)->rb_right;
+		} else {
+			/*
+			 * We should never be here.  If we are it
+			 * means allocator gave us an invalid chunk
+			 * (one that has already been allocated) so we
+			 * refuse to accept it.  Our caller will
+			 * recover by freeing the chunk.
+			 */
+			WARN_ON(1);
+			return -EADDRINUSE;
+		}
+	}
+
+	rb_link_node(&chunk->by_start, parent, new);
+	rb_insert_color(&chunk->by_start, &cma_chunks_by_start);
+
+	return 0;
+}
+
+static void __cma_chunk_release(struct kref *ref)
+{
+	struct cma_chunk *chunk = container_of(ref, struct cma_chunk, ref);
+
+	rb_erase(&chunk->by_start, &cma_chunks_by_start);
+
+	chunk->reg->alloc->free(chunk);
+	/* --chunk->reg->users; */
+	chunk->reg->free_space += chunk->size;
+}
+
+
+
+/************************* The Device API *************************/
+
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *kind);
+
+
+/* Allocate. */
+
+static dma_addr_t __must_check
+__cma_alloc_from_region(struct cma_region *reg,
+			size_t size, dma_addr_t alignment)
+{
+	struct cma_chunk *chunk;
+
+	if (!reg || reg->free_space < size)
+		return -ENOMEM;
+
+	if (!reg->alloc) {
+		if (!reg->used)
+			__cma_region_attach_alloc(reg);
+		if (!reg->alloc)
+			return -ENOMEM;
+	}
+
+	chunk = reg->alloc->alloc(reg, size, alignment);
+	if (!chunk)
+		return -ENOMEM;
+
+	if (unlikely(__cma_chunk_insert(chunk) < 0)) {
+		/* We should *never* be here. */
+		reg->alloc->free(chunk);
+		kfree(chunk);
+		return -EADDRINUSE;
+	}
+
+	chunk->reg = reg;
+	kref_init(&chunk->ref);
+	/* ++reg->users; */
+	reg->free_space -= chunk->size;
+	pr_debug("allocated at %p\n", (void *)chunk->start);
+	return chunk->start;
+}
+
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment)
+{
+	dma_addr_t addr;
+
+	mutex_lock(&cma_mutex);
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!size || alignment & (alignment - 1) || !reg || !reg->registered)
+		addr = -EINVAL;
+	else
+		addr = __cma_alloc_from_region(reg, PAGE_ALIGN(size),
+					       max(alignment, (dma_addr_t)PAGE_SIZE));
+
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(cma_alloc_from_region);
+
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *kind,
+	    dma_addr_t size, dma_addr_t alignment)
+{
+	struct cma_region *reg;
+	const char *from;
+	dma_addr_t addr;
+
+	if (dev)
+		pr_debug("allocate %p/%p for %s/%s\n",
+			 (void *)size, (void *)alignment,
+			 dev_name(dev), kind ?: "");
+
+	if (!size || alignment & (alignment - 1))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, kind);
+	if (unlikely(IS_ERR(from))) {
+		addr = PTR_ERR(from);
+		goto done;
+	}
+
+	pr_debug("allocate %p/%p from one of %s\n",
+		 (void *)size, (void *)alignment, from);
+
+	size = PAGE_ALIGN(size);
+	if (alignment < PAGE_SIZE)
+		alignment = PAGE_SIZE;
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		addr = __cma_alloc_from_region(reg, size, alignment);
+		if (!IS_ERR_VALUE(addr))
+			goto done;
+	}
+
+	pr_debug("not enough memory\n");
+	addr = -ENOMEM;
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(__cma_alloc);
+
+
+/* Query information about regions. */
+int
+__cma_info(struct cma_info *infop, const struct device *dev, const char *kind)
+{
+	struct cma_info info = { ~(dma_addr_t)0, 0, 0, 0, 0 };
+	const char *from;
+	int ret;
+
+	if (unlikely(!infop))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, kind);
+	if (IS_ERR(from)) {
+		ret = PTR_ERR(from);
+		info.lower_bound = 0;
+		goto done;
+	}
+
+	while (*from && *from != ';') {
+		struct cma_region *reg = __cma_region_find(&from);
+		if (reg) {
+			info.total_size += reg->size;
+			info.free_size += reg->free_space;
+			if (info.lower_bound > reg->start)
+				info.lower_bound = reg->start;
+			if (info.upper_bound < reg->start + reg->size)
+				info.upper_bound = reg->start + reg->size;
+			++info.count;
+		}
+	}
+
+	ret = 0;
+done:
+	mutex_unlock(&cma_mutex);
+	memcpy(infop, &info, sizeof info);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__cma_info);
+
+
+/* kref handling. */
+
+int cma_get(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+
+	mutex_lock(&cma_mutex);
+	c = __cma_chunk_find(addr);
+	if (c)
+		kref_get(&c->ref);
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("get(%p): %sfound\n", (void *)addr, c ? "" : "not ");
+	return c ? 0 : -ENOENT;
+}
+EXPORT_SYMBOL_GPL(cma_get);
+
+int cma_put(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	c = __cma_chunk_find(addr);
+	ret = c ? kref_put(&c->ref, __cma_chunk_release) : -ENOENT;
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("put(%p): %sfound%s\n", (void *)addr, c ? "" : "not ",
+		 ret > 0 ? " and destroyed" : "");
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_put);
+
+
+/************************* Miscellaneous *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg)
+{
+	struct cma_allocator *alloc;
+	int ret;
+
+	/*
+	 * If reg->alloc is set then caller wants us to use this
+	 * allocator.  Otherwise we need to find one by name.
+	 */
+	if (reg->alloc) {
+		alloc = reg->alloc;
+	} else {
+		alloc = __cma_allocator_find(reg->alloc_name);
+		if (!alloc) {
+			pr_warn("init: %s: %s: no such allocator\n",
+				reg->name ?: "(private)",
+				reg->alloc_name ?: "(default)");
+			reg->used = 1;
+			return -ENOENT;
+		}
+	}
+
+	/* Try to initialise the allocator. */
+	reg->private_data = NULL;
+	ret = alloc->init ? alloc->init(reg) : 0;
+	if (unlikely(ret < 0)) {
+		pr_err("init: %s: %s: unable to initialise allocator\n",
+		       reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+		reg->alloc = NULL;
+		reg->used = 1;
+	} else {
+		reg->alloc = alloc;
+		/* ++alloc->users; */
+		pr_debug("init: %s: %s: initialised allocator\n",
+			 reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+	}
+	return ret;
+}
+
+
+/*
+ * s            ::= rules
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' [ regions ]
+ * patterns     ::= pattern [ ',' patterns ]
+ * pattern      ::= dev-pattern [ '/' kind-pattern ]
+ *                | '/' kind-pattern
+ */
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *kind)
+{
+	/*
+	 * This function matches the pattern from the map attribute
+	 * against the given device name and kind.  Kind may of course
+	 * be NULL or an empty string.
+	 */
+
+	const char *s, *name;
+	int name_matched = 0;
+
+	/*
+	 * If dev is NULL we were called in alternative form where
+	 * kind is the from string.  All we have to do is return it
+	 * unless it's NULL or "*" in which case we return
+	 * cma_asterisk.
+	 */
+	if (!dev) {
+		if (!kind || *kind == '*')
+			return cma_asterisk ?: ERR_PTR(-ENOENT);
+		else
+			return kind;
+	}
+
+	if (!cma_map)
+		return cma_asterisk ?: ERR_PTR(-ENOENT);
+
+	name = dev_name(dev);
+	if (WARN_ON(!name || !*name))
+		return ERR_PTR(-EINVAL);
+
+	if (!kind)
+		kind = "";
+
+	/*
+	 * Now we go through the cma_map parameter.  It is what has
+	 * been provided by command line.
+	 */
+	for (s = cma_map; *s; ++s) {
+		const char *c;
+
+		/*
+		 * If the pattern starts with a slash, the device part of the
+		 * pattern matches if it matched previously.
+		 */
+		if (*s == '/') {
+			if (!name_matched)
+				goto look_for_next;
+			goto match_kind;
+		}
+
+		/*
+		 * We are now trying to match the device name.  This also
+		 * updates the name_matched variable.  If, while reading the
+		 * spec, we encounter a comma, it means that the pattern does
+		 * not match and we need to start over with another pattern
+		 * (the one after the comma).  If we encounter an equal sign,
+		 * we need to start over with another rule.  If there is a
+		 * character that does not match, we need to look for a comma
+		 * (to get another pattern) or a semicolon (to get another
+		 * rule) and try again if there is one somewhere.
+		 */
+
+		name_matched = 0;
+
+		for (c = name; *s != '*' && *c; ++c, ++s)
+			if (*s == '=')
+				goto next_rule;
+			else if (*s == ',')
+				continue;
+			else if (*s != '?' && *c != *s)
+				goto look_for_next;
+		if (*s == '*')
+			++s;
+
+		name_matched = 1;
+
+		/*
+		 * Now we need to match the kind part of the pattern.  If the
+		 * pattern lacks it, we match only if kind points to an empty
+		 * string.  Otherwise we try to match it just like the name.
+		 */
+		if (*s != '/') {
+			if (*kind)
+				goto look_for_next;
+		} else {
+match_kind:		/* s points to '/' */
+			++s;
+
+			for (c = kind; *s != '*' && *c; ++c, ++s)
+				if (*s == '=')
+					goto next_rule;
+				else if (*s == ',')
+					continue;
+				else if (*s != '?' && *c != *s)
+					goto look_for_next;
+			if (*s == '*')
+				++s;
+		}
+
+		/* Return the string behind the '=' sign of the rule. */
+		if (*s == '=' || *s == ',') {
+			s = strchr(s, '=') + 1;
+			return *s == '*' ? cma_asterisk ?: ERR_PTR(-ENOENT) : s;
+		}
+
+look_for_next:
+		do {
+			++s;
+		} while (*s != ',' && *s != '=');
+		if (*s == ',')
+			continue;
+
+next_rule:	/* s points to '=' */
+		s = strchr(s, ';');
+		if (!s)
+			break;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
-- 
1.7.1


* [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-26 14:11     ` Michal Nazarewicz
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

The Contiguous Memory Allocator framework is a set of APIs for
allocating physically contiguous chunks of memory.

Various chips require contiguous blocks of memory to operate.  Those
chips include devices such as cameras, hardware video decoders and
encoders, etc.

The code is highly modular and customisable to suit the needs of
various users.  The set of regions reserved for CMA can be configured
at run time, and it is easy to add custom allocator algorithms if
needed.

For more details see Documentation/contiguous-memory.txt.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
---
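Note on the error-return convention (illustrative only, not part of the
patch).  The documentation added below says that cma_alloc() returns
either a physical address or a value for which IS_ERR_VALUE() is true.
What follows is a minimal userspace sketch of that convention, assuming
dma_addr_t can be modelled as unsigned long.  sketch_alloc(),
sketch_get_buffer(), SKETCH_PAGE_SIZE, the 64 MiB limit and the 1 GiB
base address are hypothetical stand-ins; MAX_ERRNO and IS_ERR_VALUE()
are re-declared locally to mirror <linux/err.h>.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: dma_addr_t is modelled as unsigned long in this sketch. */
typedef unsigned long dma_addr_t;

/* Local copies mirroring the definitions in <linux/err.h>. */
#define MAX_ERRNO 4095
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

#define SKETCH_PAGE_SIZE 4096UL	/* hypothetical page size */

/*
 * Hypothetical stand-in for cma_alloc(): it rejects a zero size and an
 * alignment that is not a power of two (zero is accepted), raises the
 * alignment to at least a page, and encodes failures as negative errno
 * values stored in the address type.
 */
static dma_addr_t sketch_alloc(unsigned long size, dma_addr_t alignment)
{
	if (!size || (alignment & (alignment - 1)))
		return (dma_addr_t)-EINVAL;
	if (alignment < SKETCH_PAGE_SIZE)
		alignment = SKETCH_PAGE_SIZE;
	if (size > (64UL << 20))	/* pretend the region holds 64 MiB */
		return (dma_addr_t)-ENOMEM;
	/* Pretend the region starts at 1 GiB; round up to the alignment. */
	return ((1UL << 30) + alignment - 1) & ~(alignment - 1);
}

/* Driver-side pattern: 0 on success, negative errno on failure. */
static int sketch_get_buffer(unsigned long size, dma_addr_t alignment,
			     dma_addr_t *out)
{
	dma_addr_t addr = sketch_alloc(size, alignment);

	if (IS_ERR_VALUE(addr))
		return (int)addr;
	*out = addr;
	return 0;
}
```

The point of the sketch is that a caller must not compare the returned
value against zero or NULL: zero can be a valid physical address, while
errors occupy the top MAX_ERRNO values of the address range.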
 Documentation/00-INDEX                             |    2 +
 .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
 Documentation/contiguous-memory.txt                |  646 +++++++++++
 Documentation/kernel-parameters.txt                |    4 +
 include/linux/cma.h                                |  445 ++++++++
 mm/Kconfig                                         |   34 +
 mm/Makefile                                        |    3 +
 mm/cma-best-fit.c                                  |  407 +++++++
 mm/cma.c                                           | 1170 ++++++++++++++++++++
 9 files changed, 2720 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c
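
Note on map patterns (illustrative only, not part of the patch).  The
"Map" section of contiguous-memory.txt below describes patterns in which
'?' matches a character and a trailing '*' matches the rest of the
string.  What follows is a minimal userspace sketch of those documented
semantics.  It is not the matcher from mm/cma.c, and the names
sketch_pattern_match() and sketch_pair_match() are hypothetical.  It
assumes the device and kind parts of a pattern have already been split
at the slash.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Match str against pat, where '?' matches exactly one character and
 * a '*' ends the pattern and matches whatever is left of str.
 */
static bool sketch_pattern_match(const char *pat, const char *str)
{
	for (; *pat && *pat != '*'; ++pat, ++str) {
		if (!*str)
			return false;	/* string ended before the pattern */
		if (*pat != '?' && *pat != *str)
			return false;
	}
	/* A '*' swallows the rest; otherwise both must end together. */
	return *pat == '*' || !*str;
}

/*
 * Match a (dev, kind) pair.  A NULL kind_pat stands for a pattern with
 * no kind part, which matches only an empty or unspecified kind.
 */
static bool sketch_pair_match(const char *dev_pat, const char *kind_pat,
			      const char *dev, const char *kind)
{
	if (!kind)
		kind = "";
	if (!sketch_pattern_match(dev_pat, dev))
		return false;
	if (!kind_pat)
		return !*kind;
	return sketch_pattern_match(kind_pat, kind);
}
```

With these helpers, the documented example "baz?/a*" corresponds to
sketch_pair_match("baz?", "a*", dev, kind), and a bare "foo" pattern
corresponds to sketch_pair_match("foo", NULL, dev, kind).
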

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 5405f7a..bb50209 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -94,6 +94,8 @@ connector/
 	- docs on the netlink based userspace<->kernel space communication mod.
 console/
 	- documentation on Linux console drivers.
+contiguous-memory.txt
+	- documentation on physically-contiguous memory allocation framework.
 cpu-freq/
 	- info on CPU frequency and voltage scaling.
 cpu-hotplug.txt
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
new file mode 100644
index 0000000..05e2f6a
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
@@ -0,0 +1,9 @@
+What:		/sys/kernel/mm/contiguous/
+Date:		July 2010
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		/sys/kernel/mm/contiguous/ contains two files: asterisk and
+		map.  They are used to configure the Contiguous Memory
+		Allocator framework.
+
+		For details see Documentation/contiguous-memory.txt.
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
new file mode 100644
index 0000000..6eb1295
--- /dev/null
+++ b/Documentation/contiguous-memory.txt
@@ -0,0 +1,646 @@
+                                                             -*- org -*-
+
+* Contiguous Memory Allocator
+
+   The Contiguous Memory Allocator (CMA) is a framework, which allows
+   setting up a machine-specific configuration for physically-contiguous
+   memory management. Memory for devices is then allocated according
+   to that configuration.
+
+   The main role of the framework is not to allocate memory, but to
+   parse and manage memory configurations, as well as to act as an
+   in-between between device drivers and pluggable allocators. It is
+   thus not tied to any memory allocation method or strategy.
+
+** Why is it needed?
+
+    Various devices on embedded systems have no scatter-gather and/or
+    IO map support and as such require contiguous blocks of memory to
+    operate.  They include devices such as cameras, hardware video
+    decoders and encoders, etc.
+
+    Such devices often require big memory buffers (a full HD frame is,
+    for instance, more than 2 megapixels large, i.e. more than 6 MB
+    of memory), which makes mechanisms such as kmalloc() ineffective.
+
+    Some embedded devices impose additional requirements on the
+    buffers, e.g. they can operate only on buffers allocated in
+    particular location/memory bank (if system has more than one
+    memory bank) or buffers aligned to a particular memory boundary.
+
+    Development of embedded devices has seen a big rise recently
+    (especially in the V4L area) and many such drivers include their
+    own memory allocation code. Most of them use bootmem-based methods.
+    CMA framework is an attempt to unify contiguous memory allocation
+    mechanisms and provide a simple API for device drivers, while
+    staying as customisable and modular as possible.
+
+** Design
+
+    The main design goal for the CMA was to provide a customisable and
+    modular framework, which could be configured to suit the needs of
+    individual systems.  Configuration specifies a list of memory
+    regions, which then are assigned to devices.  Memory regions can
+    be shared among many device drivers or assigned exclusively to
+    one.  This has been achieved in the following ways:
+
+    1. The core of the CMA does not handle allocation of memory and
+       management of free space.  Dedicated allocators are used for
+       that purpose.
+
+       This way, if the provided solution does not match demands
+       imposed on a given system, one can develop a new algorithm and
+       easily plug it into the CMA framework.
+
+       The presented solution includes an implementation of a best-fit
+       algorithm.
+
+    2. CMA allows a run-time configuration of the memory regions it
+       will use to allocate chunks of memory from.  The set of memory
+       regions is given on command line so it can be easily changed
+       without the need for recompiling the kernel.
+
+       Each region has its own size, alignment demand, a start
+       address (physical address where it should be placed) and an
+       allocator algorithm assigned to the region.
+
+       This means that there can be different algorithms running at
+       the same time, if different devices on the platform have
+       distinct memory usage characteristics and different algorithms
+       suit them best.
+
+    3. When requesting memory, devices have to introduce themselves.
+       This way CMA knows who the memory is allocated for.  This
+       allows for the system architect to specify which memory regions
+       each device should use.
+
+       3a. Devices can also specify a "kind" of memory they want.
+           This makes it possible to configure the system in such
+           a way, that a single device may get memory from different
+           memory regions, depending on the "kind" of memory it
+           requested.  For example, a video codec driver might want to
+           allocate some shared buffers from the first memory bank and
+           the other from the second to get the highest possible
+           memory throughput.
+
+    4. For greater flexibility and extensibility, the framework allows
+       device drivers to register private regions of reserved memory
+       which then may be used only by them.
+
+       As a result, even if a driver does not use the rest of the CMA
+       interface, it can still use CMA allocators and other
+       mechanisms.
+
+       4a. Early in the boot process, device drivers can also request the
+           CMA framework to reserve a region of memory for them
+           which then will be used as a private region.
+
+           This way, drivers do not need to directly call bootmem,
+           memblock or similar early allocator but merely register an
+           early region and the framework will handle the rest
+           including choosing the right early allocator.
+
+** Use cases
+
+    Let's analyse some imaginary system that uses the CMA to see how
+    the framework can be used and configured.
+
+
+    We have a platform with a hardware video decoder and a camera each
+    needing 20 MiB of memory in worst case.  Our system is written in
+    such a way though that the two devices are never used at the same
+    time and memory for them may be shared.  In such a system the
+    following two configuration strings would be used:
+
+        static char regions[]  __initdata = "region=20M";
+        static char map[]      __initdata = "video,camera=region";
+
+        cma_set_defaults(regions, map, NULL);
+
+    .regions instructs CMA to allocate a region of 20 MiB and use the
+    first available memory allocator on it.  .map says that drivers
+    named "video" and "camera" are to be granted memory from the
+    previously defined region.
+
+    This can in fact be written in simpler way:
+
+        static       char regions[]  __initdata = "region=20M";
+
+        cma_set_defaults(regions, NULL, NULL);
+
+    By omitting the .map attribute, we say that all drivers are to use
+    all the regions defined with the first argument.  Generally, if
+    a device/kind pair is not matched by any rule from the .map it is
+    granted access to all regions from the .regions attribute.
+
+    We can see that, because the devices share the same region of
+    memory, we save 20 MiB of memory compared to the situation in which
+    each of the devices reserves 20 MiB of memory for itself.
+
+
+    Now, let's say that we also have many other smaller devices and we
+    want them to share some smaller pool of memory, for instance 5
+    MiB.  This can be achieved in several ways:
+
+        static char regions[]  __initdata = "region=20M;common=5M";
+        static char map[]      __initdata =
+                "video,camera=region;*/*=common";
+
+        cma_set_defaults(regions, map, NULL);
+
+    This instructs CMA to reserve two regions and let video and camera
+    use region "region" whereas all other devices should use region
+    "common".  Alternatively, we might say:
+
+        static char regions[]  __initdata = "region=20M;common=5M";
+        static char map[]      __initdata = "video,camera=region";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    It works in a similar fashion, except that by specifying the .asterisk
+    attribute, we say that devices not matched by any rule from .map
+    should access regions listed in .asterisk attribute.
+
+
+    Later on, after some development of the system, it can now run
+    video decoder and camera at the same time.  The 20 MiB region is
+    no longer enough for the two to share.  A quick fix can be made to
+    grant each of those devices separate regions:
+
+        static char regions[]  __initdata = "v=20M;c=20M;common=5M";
+        static char map[]      __initdata = "video=v;camera=c";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    This solution also shows how with CMA you can assign private pools
+    of memory to each device if that is required.
+
+    Allocation mechanisms can be replaced dynamically in a similar
+    manner as well. Let's say that during testing, it has been
+    discovered that, for a given shared region of 40 MiB,
+    fragmentation has become a problem.  It has been observed that,
+    after some time, it becomes impossible to allocate buffers of the
+    required sizes. So to satisfy our requirements, we would have to
+    reserve a larger shared region beforehand.
+
+    But fortunately, you have also managed to develop a new allocation
+    algorithm -- Neat Allocation Algorithm or "na" for short -- which
+    satisfies the needs for both devices even on a 30 MiB region.  The
+    configuration can be then quickly changed to:
+
+        static char regions[]  __initdata = "region=30M:na;common=5M";
+        static char map[]      __initdata = "video,camera=region";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    This shows how you can develop your own allocation algorithms if
+    the ones provided with CMA do not suit your needs and easily
+    replace them, without the need to modify CMA core or even
+    recompiling the kernel.
+
+** Technical Details
+
+*** The attributes
+
+    As shown above, CMA is configured by a set of attributes: regions,
+    map and asterisk.  The first one specifies regions that are to be
+    reserved for CMA.  The second one specifies what regions each
+    device is assigned to.  The third one specifies what regions
+    devices without rule in map should have access to.
+
+**** Regions
+
+     The format of the "regions" attribute is as follows:
+
+         regions-attr  ::= [ regions [ ';' ] ]
+         regions       ::= region [ ';' regions ]
+
+         region        ::= [ '-' ] REG-NAME
+                             '=' size
+                           [ '@' start ]
+                           [ '/' alignment ]
+                           [ ':' ALLOC-NAME ]
+
+         size          ::= MEMSIZE   // size of the region
+         start         ::= MEMSIZE   // desired start address of
+                                     // the region
+         alignment     ::= MEMSIZE   // alignment of the start
+                                     // address of the region
+
+     REG-NAME specifies the name of the region.  All regions given
+     via the regions attribute need to have a name.  Moreover, all
+     regions need to have a unique name.  If two regions have the same
+     name it is unspecified which will be used when requesting to
+     allocate memory from region with given name.
+
+     ALLOC-NAME specifies the name of allocator to be used with the
+     region.  If no allocator name is provided, the "default"
+     allocator will be used with the region.  The "default" allocator
+     is, of course, the first allocator that has been registered. ;)
+
+     size, start and alignment are specified in bytes with suffixes
+     that memparse() accepts.  If start is given, the region will be
+     reserved at the given starting address (or as close to it as
+     possible).  If alignment is specified, the region will be aligned
+     to the given value.
+
+     The regions attribute can also be specified by the command line
+     argument "cma".  If it is, the platform's defaults won't be used.
+
+**** Map
+
+     The format of the "map" attribute is as follows:
+
+         map-attr      ::= [ rules [ ';' ] ]
+         rules         ::= rule [ ';' rules ]
+         rule          ::= patterns '=' regions
+
+         patterns      ::= pattern [ ',' patterns ]
+
+         regions       ::= REG-NAME [ ',' regions ] | '*'
+                       // list of regions to try to allocate memory
+                       // from
+
+         pattern       ::= dev-pattern [ '/' kind-pattern ]
+                       | '/' kind-pattern
+                       // pattern request must match for the rule to
+                       // apply; the first rule that matches is
+                       // applied; if dev-pattern part is omitted
+                       // value identical to the one used in previous
+                       // pattern is assumed
+
+         dev-pattern   ::= PATTERN-STR
+                       // pattern that device name must match for the
+                       // rule to apply.
+         kind-pattern  ::= PATTERN-STR
+                       // pattern that "kind" of memory (provided by
+                       // device) must match for the rule to apply.
+
+     The "map" attribute is a sequence of rules which specify which
+     regions a given (device, kind) pair should use.  The first rule
+     that matches is applied.
+
+     For a rule to match, its pattern must match the (dev, kind) pair.
+     A pattern consists of a part before and a part after the slash:
+     the first must match the device name and the second the kind.
+
+     If the first part is empty, the device name is assumed to match
+     iff it matched in the previous pattern.
+
+     Not specifying the second part matches only empty, or
+     not-specified kind.
+
+     Patterns may contain question marks, which match any single
+     character, and may end with an asterisk, which matches the rest
+     of the string (including nothing).
+
+     The '*' as the list of regions means to use all the regions
+     listed in the "asterisk" attribute.  Moreover, if no rule matches
+     a device the list of regions from the "asterisk" attribute is
+     used as well (i.e. a "*/*=*" rule is assumed at the end).
+
+     This attribute is accessible via SysFS and can be changed at
+     run-time by writing to /sys/kernel/mm/contiguous/map.
+
+**** Asterisk
+
+     The format of the "asterisk" attribute is as follows:
+
+         asterisk-attr ::= [ regions ]
+         regions       ::= REG-NAME [ ',' regions ]
+
+     If no rule from the "map" attribute matches a given driver, the
+     regions specified by the "asterisk" attribute are used.
+
+     If this attribute is not specified by a call to cma_set_defaults()
+     it is built automatically from the regions listed in the
+     "regions" attribute.  All regions are included except for the ones
+     whose name is prefixed by a minus sign.
+
+     For instance, the following:
+
+         static char regions[]  __initdata = "region=20M;common=5M";
+         static char map[]      __initdata = "video,camera=region";
+         static char asterisk[] __initdata = "common";
+
+     can be replaced by:
+
+         static char regions[]  __initdata = "-region=20M;common=5M";
+         static char map[]      __initdata = "video,camera=region";
+
+     The biggest difference is that if the user specifies the "cma"
+     command line parameter and the platform did not specify the
+     "asterisk" attribute, it will be built from the user-supplied
+     "cma" parameter.  If the "asterisk" attribute were specified by
+     platform code, it would not be created from the command line
+     parameter.
+
+     This attribute is accessible via SysFS and can be changed at
+     run-time by writing to /sys/kernel/mm/contiguous/asterisk.
+
+**** Examples
+
+     Some examples (whitespace added for better readability):
+
+         cma = r1 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                                // (or at least as near as possible)
+                    /1M         // make sure it's aligned to 1M
+                    :foo(bar);  // uses allocator "foo" with "bar"
+                                // as parameters for it
+               r2 = 64M       // 64M region
+                    /1M;        // make sure it's aligned to 1M
+                                // uses the first available allocator
+               r3 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                    :foo;       // uses allocator "foo" with no parameters
+
+         cma_map = foo = r1;
+                       // device foo with kind==NULL uses region r1
+
+                   foo/quaz = r2;  // OR:
+                   /quaz = r2;
+                       // device foo with kind == "quaz" uses region r2
+
+                   foo/* = r3;     // OR:
+                   /* = r3;
+                       // device foo with any other kind uses region r3
+
+                   bar/* = r1,r2;
+                       // device bar with any kind uses region r1 or r2
+
+                   baz?/a* , baz?/b* = r3;
+                       // devices named baz? where ? is any character
+                       // with kind being a string starting with "a" or
+                       // "b" use r3
+
+*** The device and kind of memory
+
+    The name of the device is taken from the device structure.  It is
+    not possible to use CMA if a driver does not register a device
+    (actually this can be overcome if a fake device structure is
+    provided with at least the name set).
+
+    The kind of memory is an optional argument provided by the device
+    whenever it requests memory chunk.  In many cases this can be
+    ignored but sometimes it may be required for some devices.
+
+    For instance, let's say that there are two memory banks and for
+    performance reasons a device uses buffers in both of them.  In
+    such a case, the device driver would define two kinds and use them
+    for different buffers.  CMA attributes could look as follows:
+
+         static char regions[]  __initdata = "a=32M@0;b=32M@512M";
+         static char map[]      __initdata = "foo/a=a;foo/b=b";
+
+    And whenever the driver allocated the memory it would specify the
+    kind of memory:
+
+        buffer1 = cma_alloc(dev, "a", 1 << 20, 0);
+        buffer2 = cma_alloc(dev, "b", 1 << 20, 0);
+
+    If the driver should also try the other bank when the dedicated one
+    is full, the attributes could be changed to:
+
+         static char regions[]  __initdata = "a=32M@0;b=32M@512M";
+         static char map[]      __initdata = "foo/a=a,b;foo/b=b,a";
+
+    On the other hand, if the same driver was used on a system with
+    only one bank, the attributes could be changed to:
+
+         static char regions[]  __initdata = "r=64M";
+         static char map[]      __initdata = "foo/*=r";
+
+    without the need to change the driver at all.
+
+*** Device API
+
+    There are four calls provided by the CMA framework to devices.  To
+    allocate a chunk of memory, the cma_alloc() function is used:
+
+        dma_addr_t cma_alloc(const struct device *dev, const char *kind,
+                             size_t size, dma_addr_t alignment);
+
+    If required, the device may specify the alignment in bytes that the
+    chunk needs to satisfy.  It has to be a power of two or zero.  The
+    chunks are always aligned at least to a page.
+
+    The kind specifies the kind of memory as described in the
+    previous subsection.  If the device driver does not use the notion
+    of memory kinds, it is safe to pass NULL as the kind.
+
+    The basic usage of the function is simply:
+
+        addr = cma_alloc(dev, NULL, size, 0);
+
+    The function returns the physical address of the allocated chunk
+    or a value that evaluates to true when checked with IS_ERR_VALUE(),
+    so the correct way of checking for errors is:
+
+        dma_addr_t addr = cma_alloc(dev, NULL, size, 0);
+        if (IS_ERR_VALUE(addr))
+                return (int)addr;
+        /* Allocated */
+
+    (Make sure to include <linux/err.h> which contains the definition
+    of the IS_ERR_VALUE() macro.)
+
+
+    An allocated chunk is freed via the cma_put() function:
+
+        int cma_put(dma_addr_t addr);
+
+    It takes the physical address of the chunk as an argument and
+    decreases its reference counter.  If the counter reaches zero the
+    chunk is freed.  Most of the time users do not need to think about
+    the reference counter and simply use cma_put() as a free call.
+
+    If one, however, were to share a chunk with others, the built-in
+    reference counter may turn out to be handy.  To increment it, one
+    needs to use the cma_get() function:
+
+        int cma_get(dma_addr_t addr);
+
+
+    The last function is cma_info(), which returns information about
+    the regions assigned to a given (dev, kind) pair.  Its syntax is:
+
+        int cma_info(struct cma_info *info,
+                     const struct device *dev,
+                     const char *kind);
+
+    On success it fills the info structure with the lower and upper
+    bound of the regions, their total and free size, and the number of
+    regions assigned to the given (dev, kind) pair.
+
+**** Dynamic and private regions
+
+     In the basic setup, regions are provided via the command line
+     and then reserved by platform initialisation code (which usually
+     calls the cma_early_regions_reserve() function).
+
+     It is, however, possible to create and add regions dynamically
+     using the cma_region_register() function.
+
+         int cma_region_register(struct cma_region *reg);
+
+     The region does not have to have a name.  If it does not, it
+     won't be accessible via the standard mapping (the one provided
+     with the map attribute).  Such regions are private and to
+     allocate a chunk from them, one needs to call:
+
+         dma_addr_t cma_alloc_from_region(struct cma_region *reg,
+                                          size_t size, dma_addr_t alignment);
+
+     It is just like cma_alloc() except one specifies what region to
+     allocate memory from.  The region must have been registered.
+
+**** Allocating from region specified by name
+
+     If a driver prefers to allocate from a region or a list of
+     regions whose names it knows, it can use a call similar to the
+     previous one:
+
+         dma_addr_t cma_alloc_from(const char *regions,
+                                   size_t size, dma_addr_t alignment);
+
+     The first argument is a comma-separated list of regions the
+     driver desires CMA to try and allocate from.  The list is
+     terminated by NUL byte or a semicolon.
+
+     Similarly, there is a call for requesting information about named
+     regions:
+
+        int cma_info_about(struct cma_info *info, const char *regions);
+
+     Generally, drivers should not need to use those interfaces but
+     they are provided nevertheless.
+
+**** Registering early regions
+
+     An early region is a region that is managed by CMA early during
+     the boot process.  It is the platform's responsibility to reserve
+     memory for early regions.  Later on, when CMA initialises, early
+     regions with reserved memory are registered as normal regions.
+     Registering an early region may be a way for a device to request
+     a private pool of memory without worrying about actually
+     reserving the memory:
+
+         int cma_early_region_register(struct cma_region *reg);
+
+     This needs to be done quite early in the boot process, before
+     the platform traverses the cma_early_regions list to reserve memory.
+
+     When the boot process ends, the device driver may see whether the
+     region was reserved (by checking the reg->reserved flag) and if
+     so, whether it was successfully registered as a normal region (by
+     checking the reg->registered flag).  If that is the case, the
+     device driver can use normal API calls to use the region.
+
+*** Allocator operations
+
+    Creating an allocator for CMA needs four functions to be
+    implemented.
+
+
+    The first two are used to initialise an allocator for a given
+    region and clean up afterwards:
+
+        int  cma_foo_init(struct cma_region *reg);
+        void cma_foo_cleanup(struct cma_region *reg);
+
+    The first is called when the allocator is attached to a region.
+    The cma_region structure holds the starting address of the region
+    as well as its size.  Any data that the allocator associates with
+    the region can be saved in the private_data field.
+
+    The second call cleans up and frees all resources the allocator
+    has allocated for the region.  The function can assume that all
+    chunks allocated from this region have been freed thus the whole
+    region is free.
+
+
+    The two other calls are used for allocating and freeing chunks.
+    They are:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment);
+        void cma_foo_free(struct cma_chunk *chunk);
+
+    As the names imply, the first allocates a chunk and the other
+    frees a chunk of memory.  They also manage the cma_chunk object
+    representing the chunk in physical memory.
+
+    Both of those functions can assume that they are the only thread
+    accessing the region.  Therefore, the allocator does not need to
+    worry about concurrency.  Moreover, all arguments are guaranteed
+    to be valid (i.e. a page-aligned size and a power-of-two alignment
+    no lower than the page size).
+
+
+    When the allocator is ready, all that is left is to register it by
+    calling the cma_allocator_register() function:
+
+            int cma_allocator_register(struct cma_allocator *alloc);
+
+    The argument is a structure with pointers to the above functions
+    and the allocator's name.  The whole call may look something like
+    this:
+
+        static struct cma_allocator alloc = {
+                .name    = "foo",
+                .init    = cma_foo_init,
+                .cleanup = cma_foo_cleanup,
+                .alloc   = cma_foo_alloc,
+                .free    = cma_foo_free,
+        };
+        return cma_allocator_register(&alloc);
+
+    The name ("foo") will be available to use with command line
+    argument.
+
+*** Integration with platform
+
+    There is one function that needs to be called from platform
+    initialisation code.  That is the cma_early_regions_reserve()
+    function:
+
+        void cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+    It traverses the list of all of the regions given on the command
+    line and reserves memory for them.  The only argument is a callback
+    function used to reserve the region.  Passing NULL as the argument
+    makes the function use the cma_early_region_reserve() function,
+    which uses bootmem or memblock for allocating.
+
+    Alternatively, platform code could traverse the cma_early_regions
+    list by itself but this should not be necessary.
+
+
+    The platform also has a way of providing default attributes for
+    CMA.  The cma_set_defaults() function is used for that purpose:
+
+        int __init cma_set_defaults(char *regions, const char *map,
+                                    const char *asterisk);
+
+    It needs to be called after early params have been parsed but
+    prior to reserving regions.  It lets one specify the three
+    attributes used to configure CMA.  Any of the pointers may point
+    to a string in the __initdata section.  The regions argument must
+    be modifiable though.
+
+** Future work
+
+    In the future, implementation of mechanisms that would allow the
+    free space inside the regions to be used as page cache, filesystem
+    buffers or swap devices is planned.  With such mechanisms, the
+    memory would not be wasted when not used.
+
+    Because all allocations and freeing of chunks pass through the CMA
+    framework, it can follow what parts of the reserved memory are
+    freed and what parts are allocated.  Tracking the unused memory
+    would let CMA use it for other purposes such as page cache, I/O
+    buffers, swap, etc.
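A note on the error convention described in the Device API section of the
documentation above: cma_alloc() encodes errors in the returned address, so
callers test the value with IS_ERR_VALUE() before using it.  The user-space
sketch below models that convention only.  model_addr_t,
MODEL_IS_ERR_VALUE(), model_alloc() and model_probe() are hypothetical
stand-ins made up for illustration; they are not part of this patch, and the
macro merely approximates the one found in <linux/err.h>.

```c
#include <errno.h>

/* Hypothetical stand-in for dma_addr_t. */
typedef unsigned long model_addr_t;

/* Approximation of IS_ERR_VALUE(): the top 4095 values encode -errno. */
#define MODEL_MAX_ERRNO 4095UL
#define MODEL_IS_ERR_VALUE(x) \
	((unsigned long)(x) >= (unsigned long)-MODEL_MAX_ERRNO)

/*
 * Hypothetical allocator over a single 1 MiB pool.  Like cma_alloc(),
 * it returns either an address or a negative error cast to the
 * address type.
 */
static model_addr_t model_alloc(unsigned long size)
{
	const unsigned long pool_start = 0x20000000UL;
	const unsigned long pool_size = 1UL << 20;

	if (size == 0)
		return (model_addr_t)-EINVAL;
	if (size > pool_size)
		return (model_addr_t)-ENOMEM;
	return pool_start;
}

/* Caller side: check the value before treating it as an address. */
static int model_probe(unsigned long size, model_addr_t *out)
{
	model_addr_t addr = model_alloc(size);

	if (MODEL_IS_ERR_VALUE(addr))
		return (int)addr;	/* propagate the negative errno */
	*out = addr;
	return 0;
}
```

The point of the sketch is that a plain NULL or zero check is not enough:
zero may be a valid physical address, whereas the error values sit at the
very top of the address range.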
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a698255..5c7af89 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -43,6 +43,7 @@ parameter is applicable:
 	AVR32	AVR32 architecture is enabled.
 	AX25	Appropriate AX.25 support is enabled.
 	BLACKFIN Blackfin architecture is enabled.
+	CMA	Contiguous Memory Allocator is enabled.
 	DRM	Direct Rendering Management support is enabled.
 	EDD	BIOS Enhanced Disk Drive Services (EDD) is enabled
 	EFI	EFI Partitioning (GPT) is enabled
@@ -476,6 +477,9 @@ and is between 256 and 4096 characters. It is defined in the file
 			Also note the kernel might malfunction if you disable
 			some critical bits.
 
+	cma=		[CMA] List of CMA regions.
+			See Documentation/contiguous-memory.txt for details.
+
 	cmo_free_hint=	[PPC] Format: { yes | no }
 			Specify whether pages are marked as being inactive
 			when they are freed.  This is used in CMO environments
diff --git a/include/linux/cma.h b/include/linux/cma.h
new file mode 100644
index 0000000..96a8652
--- /dev/null
+++ b/include/linux/cma.h
@@ -0,0 +1,445 @@
+#ifndef __LINUX_CMA_H
+#define __LINUX_CMA_H
+
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+/***************************** Kernel level API *****************************/
+
+#ifdef __KERNEL__
+
+#include <linux/kref.h>
+#include <linux/rbtree.h>
+#include <linux/list.h>
+
+
+struct device;
+struct cma_info;
+
+/*
+ * Don't call it directly, use cma_alloc(), cma_alloc_from() or
+ * cma_alloc_from_region().
+ */
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *kind,
+	    size_t size, dma_addr_t alignment);
+
+/* Don't call it directly, use cma_info() or cma_info_about(). */
+int
+__cma_info(struct cma_info *info, const struct device *dev, const char *kind);
+
+
+/**
+ * cma_alloc - allocates contiguous chunk of memory.
+ * @dev:	The device to perform allocation for.
+ * @kind:	A kind of memory to allocate.  A device may use several
+ * 		different kinds of memory which are configured
+ * 		separately.  Usually it's safe to pass NULL here.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc(const struct device *dev, const char *kind,
+	  size_t size, dma_addr_t alignment)
+{
+	return dev ? __cma_alloc(dev, kind, size, alignment) : -EINVAL;
+}
+
+
+/**
+ * struct cma_info - information about regions returned by cma_info().
+ * @lower_bound:	The smallest address that is possible to be
+ * 			allocated for given (dev, kind) pair.
+ * @upper_bound:	The one byte after the biggest address that is
+ * 			possible to be allocated for given (dev, kind)
+ * 			pair.
+ * @total_size:	Total size of regions mapped to (dev, kind) pair.
+ * @free_size:	Total free size in all of the regions mapped to (dev, kind)
+ * 		pair.  Because of possible race conditions, it is not
+ * 		guaranteed that the value will be correct -- it gives only
+ * 		an approximation.
+ * @count:	Number of regions mapped to (dev, kind) pair.
+ */
+struct cma_info {
+	dma_addr_t lower_bound, upper_bound;
+	size_t total_size, free_size;
+	unsigned count;
+};
+
+/**
+ * cma_info - queries information about regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @dev:	The device to query information for.
+ * @kind:	A kind of memory to query information for.
+ * 		If unsure, pass NULL here.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info(struct cma_info *info, const struct device *dev, const char *kind)
+{
+	return dev ? __cma_info(info, dev, kind) : -EINVAL;
+}
+
+
+/**
+ * cma_get - increases reference counter of a chunk.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns zero on success or -ENOENT if there is no chunk at given
+ * location.  In the latter case issues a warning and a stacktrace.
+ */
+int cma_get(dma_addr_t addr);
+
+/**
+ * cma_put - decreases reference counter of a chunk.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns one if the chunk has been freed, zero if it hasn't, and
+ * -ENOENT if there is no chunk at given location.  In the latter case
+ * issues a warning and a stacktrace.
+ *
+ * If this function returns zero, you still can not count on the area
+ * remaining in memory.  Only use the return value if you want to see
+ * if the area is now gone, not present.
+ */
+int cma_put(dma_addr_t addr);
+
+
+
+/****************************** Lower level API *****************************/
+
+/**
+ * cma_alloc_from - allocates contiguous chunk of memory from named regions.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ * 		byte or a semicolon.  "*" or NULL means to try all regions
+ * 		which are listed as asterisk regions.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)
+{
+	return __cma_alloc(NULL, regions, size, alignment);
+}
+
+/**
+ * cma_info_about - queries information about named regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ * 		byte or a semicolon.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+{
+	return __cma_info(info, NULL, regions);
+}
+
+
+
+struct cma_allocator;
+
+/**
+ * struct cma_region - a region reserved for CMA allocations.
+ * @name:	Unique name of the region.  Read only.
+ * @start:	Physical starting address of the region in bytes.  Always
+ * 		aligned at least to a full page.  Read only.
+ * @size:	Size of the region in bytes.  Multiple of a page size.
+ * 		Read only.
+ * @free_space:	Free space in the region.  Read only.
+ * @alignment:	Desired alignment of the region in bytes.  A power of two,
+ * 		always at least page size.  Early.
+ * @alloc:	Allocator used with this region.  NULL means allocator is
+ * 		not attached.  Private.
+ * @alloc_name:	Allocator name read from cmdline.  Private.  This may be
+ * 		different from @alloc->name.
+ * @private_data:	Allocator's private data.
+ * @used:	Whether region was already used, ie. there was at least
+ * 		one allocation request for it.  Private.
+ * @list:	Entry in list of regions.  Private.
+ * @registered:	Whether this region has been registered.  Read only.
+ * @reserved:	Whether this region has been reserved.  Early.  Read only.
+ * @copy_name:	Whether @name and @alloc_name need to be copied when
+ * 		this region is converted from early to normal.  Early.
+ * 		Private.
+ *
+ * Regions come in two types: an early region and normal region.  The
+ * former can be reserved or not-reserved.  Fields marked as "early"
+ * are only meaningful in early regions.
+ *
+ * Early regions are important only during initialisation.  The list
+ * of early regions is built from the "cma" command line argument or
+ * platform defaults.  Platform initialisation code is responsible for
+ * reserving space for unreserved regions that are placed on
+ * cma_early_regions list.
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+ /* * @users:	Number of chunks allocated in this region. */
+struct cma_region {
+	const char *name;
+	dma_addr_t start;
+	size_t size;
+	union {
+		size_t free_space;	/* Normal region */
+		dma_addr_t alignment;	/* Early region */
+	};
+
+	struct cma_allocator *alloc;
+	const char *alloc_name;
+	union {
+		void *private_data;	/* Normal region w/ allocator */
+		unsigned used;		/* Normal region w/o allocator */
+	};
+
+	/* unsigned users; */
+	struct list_head list;
+
+	unsigned registered:1;
+	unsigned reserved:1;
+	unsigned copy_name:1;
+};
+
+
+/**
+ * cma_region_register() - registers a region.
+ * @reg:	Region to add.
+ *
+ * Region's start and size must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region and accessible only using the
+ * cma_alloc_from_region() function.
+ *
+ * If alloc is set function will try to initialise given allocator
+ * (and will return error if it fails).  Otherwise alloc_name may
+ * point to a name of an allocator to use (if not set, the default
+ * will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  In particular, -EADDRINUSE if
+ * the region overlaps with an already existing region.
+ */
+int __must_check cma_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_alloc_from_region() - allocates contiguous chunk of memory from region.
+ * @reg:	Region to allocate chunk from.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);
+
+
+
+/****************************** Allocators API ******************************/
+
+/**
+ * struct cma_chunk - an allocated contiguous chunk of memory.
+ * @start:	Physical address in bytes.
+ * @size:	Size in bytes.
+ * @reg:	Region this chunk belongs to.
+ * @ref:	Reference counter of the chunk, manipulated with cma_get()
+ * 		and cma_put().  Private.
+ * @by_start:	A node in a red-black tree with all chunks sorted by
+ * 		start address.
+ *
+ * The cma_allocator::alloc() operation needs to set only the @start
+ * and @size fields.  The rest is handled by the caller (ie. CMA
+ * glue).
+ */
+struct cma_chunk {
+	dma_addr_t start;
+	size_t size;
+
+	struct cma_region *reg;
+	struct kref ref;
+	struct rb_node by_start;
+};
+
+
+/**
+ * struct cma_allocator - a CMA allocator.
+ * @name:	Allocator's unique name
+ * @init:	Initialises an allocator on given region.
+ * @cleanup:	Cleans up after init.  May assume that there are no chunks
+ * 		allocated in given region.
+ * @alloc:	Allocates a chunk of memory of given size in bytes and
+ * 		with given alignment.  Alignment is a power of
+ * 		two (thus non-zero) and callback does not need to check it.
+ * 		May also assume that it is the only call that uses given
+ * 		region (ie. access to the region is synchronised with
+ * 		a mutex).  This has to allocate the chunk object (it may be
+ * 		contained in a bigger structure with allocator-specific data).
+ * 		Required.
+ * @free:	Frees allocated chunk.  May also assume that it is the only
+ * 		call that uses given region.  This has to free() the chunk
+ * 		object as well.  Required.
+ * @list:	Entry in list of allocators.  Private.
+ */
+ /* * @users:	How many regions use this allocator.  Private. */
+struct cma_allocator {
+	const char *name;
+
+	int (*init)(struct cma_region *reg);
+	void (*cleanup)(struct cma_region *reg);
+	struct cma_chunk *(*alloc)(struct cma_region *reg, size_t size,
+				   dma_addr_t alignment);
+	void (*free)(struct cma_chunk *chunk);
+
+	/* unsigned users; */
+	struct list_head list;
+};
+
+
+/**
+ * cma_allocator_register() - Registers an allocator.
+ * @alloc:	Allocator to register.
+ *
+ * Adds allocator to the list of allocators managed by CMA.
+ *
+ * All of the fields of the cma_allocator structure must be set except
+ * for the optional name and for list, which will be overridden.
+ *
+ * Returns zero or negative error code.
+ */
+int cma_allocator_register(struct cma_allocator *alloc);
+
+
+/**************************** Initialisation API ****************************/
+
+/**
+ * cma_set_defaults() - specifies default command line parameters.
+ * @regions:	Default regions attribute.  May be overridden by "cma"
+ * 		command line argument.
+ * @map:	Default map attribute.  If not set all devices will use
+ * 		regions specified by @asterisk attribute.
+ * @asterisk:	Default asterisk attribute.  If not set will be built from
+ * 		@regions.  Specifies list of regions to use for devices
+ * 		that are missing a rule in @map.
+ *
+ * This function should be called prior to cma_early_regions_reserve()
+ * and after early parameters have been parsed.  The arguments
+ * specify defaults for various CMA attributes.
+ *
+ * All the pointers can point to strings in __initdata.  @regions
+ * needs to be modifiable.
+ *
+ * Returns negative error code if there was an error parsing the
+ * @regions parameter.  Zero otherwise.
+ */
+int __init cma_set_defaults(char *regions, const char *map,
+                            const char *asterisk);
+
+
+/**
+ * cma_early_regions - a list of early regions.
+ *
+ * Platform needs to allocate space for each of the regions before
+ * initcalls are executed.  If space is reserved, the reserved flag
+ * must be set.  Platform initialisation code may choose to use
+ * cma_early_regions_reserve().
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+extern struct list_head cma_early_regions __initdata;
+
+
+/**
+ * cma_early_region_register() - registers an early region.
+ * @reg:	Region to add.
+ *
+ * Region's start, size and alignment must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region accessible only using the cma_alloc_from_region().
+ *
+ * If alloc is set function will try to initialise given allocator
+ * when the early region is "converted" to normal region and
+ * registered during CMA initialisation.  If this fails, the space
+ * will still be reserved but the region won't be registered.
+ *
+ * As usual, alloc_name may point to a name of an allocator to use
+ * (if neither alloc nor alloc_name is set, the default will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  No checking if regions overlap is
+ * performed.
+ */
+int __init __must_check cma_early_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_early_region_reserve() - reserves a physically contiguous memory region.
+ * @reg:	Early region to reserve memory for.
+ *
+ * If platform supports bootmem this is the first allocator this
+ * function tries to use.  If that fails (or bootmem is not
+ * supported) function tries to use memblock if it is available.
+ *
+ * On success sets reg->reserved flag.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_early_region_reserve(struct cma_region *reg);
+
+/**
+ * cma_early_regions_reserve() - helper function for reserving early regions.
+ * @reserve:	Callback function used to reserve space for region.  Needs
+ * 		to return non-negative if allocation succeeded, negative
+ * 		error otherwise.  NULL means cma_early_region_reserve() will
+ * 		be used.
+ *
+ * This function traverses the %cma_early_regions list and tries to
+ * reserve memory for each early region.  It uses the @reserve
+ * callback function for that purpose.  The reserved flag of each
+ * region is updated accordingly.
+ */
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+#else
+
+#define cma_set_defaults(regions, map, asterisk) ((int)0)
+#define cma_early_regions_reserve(reserve)   do { } while (0)
+
+#endif
+
+#endif
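The kernel-doc in the header above states the same rule for every
allocation call: the alignment must be a power of two or zero, and anything
smaller than a page is raised to the page size.  A minimal user-space sketch
of that rule follows, assuming a 4 KiB page.  MODEL_PAGE_SIZE and
model_fix_alignment() are hypothetical names chosen for illustration, and
the actual check in mm/cma.c may well be written differently.

```c
#include <errno.h>

/* Assumption for this sketch: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096UL

/*
 * Validates and normalises an alignment request in place.
 * Returns 0 on success or -EINVAL if the value is neither zero
 * nor a power of two; the value is left untouched on error.
 */
static int model_fix_alignment(unsigned long *alignment)
{
	unsigned long a = *alignment;

	/* A power of two has a single bit set; zero also passes. */
	if (a & (a - 1))
		return -EINVAL;

	/* Chunks are always aligned at least to a page. */
	if (a < MODEL_PAGE_SIZE)
		a = MODEL_PAGE_SIZE;

	*alignment = a;
	return 0;
}
```

With this normalisation done once in the framework, an allocator's alloc
callback can rely on a non-zero power-of-two alignment, which is what the
struct cma_allocator documentation promises.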
diff --git a/mm/Kconfig b/mm/Kconfig
index f4e516e..468b874 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -301,3 +301,37 @@ config NOMMU_INITIAL_TRIM_EXCESS
 	  of 1 says that all excess pages should be trimmed.
 
 	  See Documentation/nommu-mmap.txt for more information.
+
+
+config CMA
+	bool "Contiguous Memory Allocator framework"
+	# Currently there is only one allocator so force it on
+	select CMA_BEST_FIT
+	help
+	  This enables the Contiguous Memory Allocator framework which
+	  allows drivers to allocate big physically-contiguous blocks of
+	  memory for use with hardware components that support neither
+	  I/O map nor scatter-gather.
+
+	  If you select this option you will also have to select at least
+	  one allocator algorithm below.
+
+	  To make use of CMA you need to specify the regions and
+	  driver->region mapping on command line when booting the kernel.
+
+config CMA_DEBUG
+	bool "CMA debug messages"
+	depends on CMA
+	help
+	  Enable debug messages in CMA code.
+
+config CMA_BEST_FIT
+	bool "CMA best-fit allocator"
+	depends on CMA
+	default y
+	help
+	  This is a best-fit algorithm running in O(n log n) time where
+	  n is the number of existing holes (which is never greater than
+	  the number of allocated regions and usually much smaller).  It
+	  allocates area from the smallest hole that is big enough for
+	  allocation in question.
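The help text above describes best fit as taking the smallest hole that is
big enough for the request.  The user-space sketch below illustrates that
selection rule, including the alignment constraint.  It is a deliberately
simplified model: it scans a plain array linearly, whereas the allocator in
this patch keeps holes in red-black trees, and struct model_hole,
MODEL_ALIGN() and model_best_fit() are hypothetical names that do not appear
in the patch.

```c
#include <stddef.h>

/* Hypothetical description of a free range inside a region. */
struct model_hole {
	unsigned long start;
	unsigned long size;
};

/* Rounds x up to a multiple of a; a must be a power of two. */
#define MODEL_ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/*
 * Returns the index of the smallest hole that can hold size bytes
 * starting at an address aligned to alignment, or -1 if none can.
 */
static int model_best_fit(const struct model_hole *holes, size_t n,
			  unsigned long size, unsigned long alignment)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; ++i) {
		unsigned long start = MODEL_ALIGN(holes[i].start, alignment);
		unsigned long end = holes[i].start + holes[i].size;

		/* Skip holes too small once alignment is applied. */
		if (start >= end || end - start < size)
			continue;

		if (best < 0 || holes[i].size < holes[(size_t)best].size)
			best = (int)i;
	}
	return best;
}
```

Note that a hole may be large enough by size alone and still be rejected
because aligning its start leaves too little room, which is why the
allocator below checks alignment separately after the size lookup.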
diff --git a/mm/Makefile b/mm/Makefile
index 34b2546..54b0e99 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -47,3 +47,6 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+
+obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
diff --git a/mm/cma-best-fit.c b/mm/cma-best-fit.c
new file mode 100644
index 0000000..59515f9
--- /dev/null
+++ b/mm/cma-best-fit.c
@@ -0,0 +1,407 @@
+/*
+ * Contiguous Memory Allocator framework: Best Fit allocator
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: bf: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/slab.h>        /* kmalloc() */
+
+#include <linux/cma.h>         /* CMA structures */
+
+
+/************************* Data Types *************************/
+
+struct cma_bf_item {
+	struct cma_chunk ch;
+	struct rb_node by_size;
+};
+
+struct cma_bf_private {
+	struct rb_root by_start_root;
+	struct rb_root by_size_root;
+};
+
+
+/************************* Prototypes *************************/
+
+/*
+ * Those are only for holes.  They must be called whenever hole's
+ * properties change but also whenever chunk becomes a hole or hole
+ * becomes a chunk.
+ */
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item);
+static int  __must_check
+            __cma_bf_hole_insert_by_start(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item);
+
+/**
+ * __cma_bf_hole_take - takes a chunk of memory out of a hole.
+ * @hole:	hole to take chunk from
+ * @size:	chunk's size
+ * @alignment:	chunk's starting address alignment (must be power of two)
+ *
+ * Takes a @size bytes large chunk from hole @hole which must be able
+ * to hold the chunk.  The "must be able" includes also alignment
+ * constraint.
+ *
+ * Returns allocated item or NULL on error (if kmalloc() failed).
+ */
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment);
+
+/**
+ * __cma_bf_hole_merge_maybe - tries to merge hole with neighbours.
+ * @item: hole to try and merge
+ *
+ * Which items are preserved is undefined so you may not rely on it.
+ */
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item);
+
+
+/************************* Device API *************************/
+
+int cma_bf_init(struct cma_region *reg)
+{
+	struct cma_bf_private *prv;
+	struct cma_bf_item *item;
+
+	prv = kzalloc(sizeof *prv, GFP_KERNEL);
+	if (unlikely(!prv))
+		return -ENOMEM;
+
+	item = kzalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item)) {
+		kfree(prv);
+		return -ENOMEM;
+	}
+
+	item->ch.start = reg->start;
+	item->ch.size  = reg->size;
+	item->ch.reg   = reg;
+
+	rb_root_init(&prv->by_start_root, &item->ch.by_start);
+	rb_root_init(&prv->by_size_root, &item->by_size);
+
+	reg->private_data = prv;
+	return 0;
+}
+
+void cma_bf_cleanup(struct cma_region *reg)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct cma_bf_item *item =
+		rb_entry(prv->by_size_root.rb_node,
+			 struct cma_bf_item, by_size);
+
+	/* We can assume there is only a single hole in the tree. */
+	WARN_ON(item->by_size.rb_left || item->by_size.rb_right ||
+		item->ch.by_start.rb_left || item->ch.by_start.rb_right);
+
+	kfree(item);
+	kfree(prv);
+}
+
+struct cma_chunk *cma_bf_alloc(struct cma_region *reg,
+			       size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct rb_node *node = prv->by_size_root.rb_node;
+	struct cma_bf_item *item = NULL;
+
+	/* First find hole that is large enough */
+	while (node) {
+		struct cma_bf_item *i =
+			rb_entry(node, struct cma_bf_item, by_size);
+
+		if (i->ch.size < size) {
+			node = node->rb_right;
+		} else {
+			node = node->rb_left;
+			item = i;
+		}
+	}
+	if (!item)
+		return NULL;
+
+	/* Now look for items which can satisfy alignment requirements */
+	for (;;) {
+		dma_addr_t start = ALIGN(item->ch.start, alignment);
+		dma_addr_t end   = item->ch.start + item->ch.size;
+		if (start < end && end - start >= size) {
+			item = __cma_bf_hole_take(item, size, alignment);
+			return likely(item) ? &item->ch : NULL;
+		}
+
+		node = rb_next(&item->by_size);
+		if (!node)
+			return NULL;
+
+		item  = rb_entry(node, struct cma_bf_item, by_size);
+	}
+}
+
+void cma_bf_free(struct cma_chunk *chunk)
+{
+	struct cma_bf_item *item = container_of(chunk, struct cma_bf_item, ch);
+
+	/* Add new hole */
+	if (unlikely(__cma_bf_hole_insert_by_start(item))) {
+		/*
+		 * We're screwed...  Just free the item and forget
+		 * about it.  Things are broken beyond repair so no
+		 * sense in trying to recover.
+		 */
+		kfree(item);
+	} else {
+		__cma_bf_hole_insert_by_size(item);
+
+		/* Merge with prev and next sibling */
+		__cma_bf_hole_merge_maybe(item);
+	}
+}
+
+
+/************************* Basic Tree Manipulation *************************/
+
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_size_root.rb_node, *parent = NULL;
+	const typeof(item->ch.size) value = item->ch.size;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, by_size);
+		link = value <= i->ch.size
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->by_size, parent, link);
+	rb_insert_color(&item->by_size, &prv->by_size_root);
+}
+
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->by_size, &prv->by_size_root);
+}
+
+static int  __must_check
+            __cma_bf_hole_insert_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_start_root.rb_node, *parent = NULL;
+	const typeof(item->ch.start) value = item->ch.start;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, ch.by_start);
+
+		if (WARN_ON(value == i->ch.start))
+			/*
+			 * This should *never* happen.  And I mean
+			 * *never*.  We could even BUG on it but
+			 * hopefully things are only a bit broken,
+			 * ie. system can still run.  We produce
+			 * a warning and return an error.
+			 */
+			return -EBUSY;
+
+		link = value <= i->ch.start
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->ch.by_start, parent, link);
+	rb_insert_color(&item->ch.by_start, &prv->by_start_root);
+	return 0;
+}
+
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->ch.by_start, &prv->by_start_root);
+}
+
+
+/************************* More Tree Manipulation *************************/
+
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, size_t alignment)
+{
+	struct cma_bf_item *item;
+
+	/*
+	 * There are three cases:
+	 * 1. the chunk takes the whole hole,
+	 * 2. the chunk is at the beginning or at the end of the hole, or
+	 * 3. the chunk is in the middle of the hole.
+	 */
+
+
+	/* Case 1, the whole hole */
+	if (size == hole->ch.size) {
+		__cma_bf_hole_erase_by_size(hole);
+		__cma_bf_hole_erase_by_start(hole);
+		return hole;
+	}
+
+
+	/* Allocate */
+	item = kmalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item))
+		return NULL;
+
+	item->ch.start = ALIGN(hole->ch.start, alignment);
+	item->ch.size  = size;
+
+	/* Case 3, in the middle */
+	if (item->ch.start != hole->ch.start
+	 && item->ch.start + item->ch.size !=
+	    hole->ch.start + hole->ch.size) {
+		struct cma_bf_item *tail;
+
+		/*
+		 * Space between the end of the chunk and the end of
+		 * the region, ie. space left after the end of the
+		 * chunk.  If this is divisible by alignment we can
+		 * move the chunk to the end of the hole.
+		 */
+		size_t left =
+			hole->ch.start + hole->ch.size -
+			(item->ch.start + item->ch.size);
+		if (left % alignment == 0) {
+			item->ch.start += left;
+			goto case_2;
+		}
+
+		/*
+		 * We are going to add a hole at the end.  This way,
+		 * we will reduce the problem to case 2 -- the chunk
+		 * will be at the end of the hole.
+		 */
+		tail = kmalloc(sizeof *tail, GFP_KERNEL);
+		if (unlikely(!tail)) {
+			kfree(item);
+			return NULL;
+		}
+
+		tail->ch.start = item->ch.start + item->ch.size;
+		tail->ch.size  =
+			hole->ch.start + hole->ch.size - tail->ch.start;
+		tail->ch.reg   = hole->ch.reg;
+
+		if (unlikely(__cma_bf_hole_insert_by_start(tail))) {
+			/*
+			 * Things are broken beyond repair...  Abort
+			 * inserting the hole but still continue with
+			 * allocation (seems like the best we can do).
+			 */
+
+			hole->ch.size = tail->ch.start - hole->ch.start;
+			kfree(tail);
+		} else {
+			__cma_bf_hole_insert_by_size(tail);
+			/*
+			 * It's important that we first insert the new
+			 * hole in the tree sorted by size and later
+			 * reduce the size of the old hole.  We will
+			 * update the position of the old hole in the
+			 * rb tree in code that handles case 2.
+			 */
+			hole->ch.size = tail->ch.start - hole->ch.start;
+		}
+
+		/* Go to case 2 */
+	}
+
+
+	/* Case 2, at the beginning or at the end */
+case_2:
+	/* No need to update the tree; order preserved. */
+	if (item->ch.start == hole->ch.start)
+		hole->ch.start += item->ch.size;
+
+	/* Alter hole's size */
+	hole->ch.size -= size;
+	__cma_bf_hole_erase_by_size(hole);
+	__cma_bf_hole_insert_by_size(hole);
+
+	return item;
+}
+
+
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item)
+{
+	struct cma_bf_item *prev;
+	struct rb_node *node;
+	int twice = 2;
+
+	node = rb_prev(&item->ch.by_start);
+	if (unlikely(!node))
+		goto next;
+	prev = rb_entry(node, struct cma_bf_item, ch.by_start);
+
+	for (;;) {
+		if (prev->ch.start + prev->ch.size == item->ch.start) {
+			/* Remove previous hole from trees */
+			__cma_bf_hole_erase_by_size(prev);
+			__cma_bf_hole_erase_by_start(prev);
+
+			/* Alter this hole */
+			item->ch.size += prev->ch.size;
+			item->ch.start = prev->ch.start;
+			__cma_bf_hole_erase_by_size(item);
+			__cma_bf_hole_insert_by_size(item);
+			/*
+			 * No need to update by start trees as we do
+			 * not break sequence order
+			 */
+
+			/* Free prev hole */
+			kfree(prev);
+		}
+
+next:
+		if (!--twice)
+			break;
+
+		node = rb_next(&item->ch.by_start);
+		if (unlikely(!node))
+			break;
+		prev = item;
+		item = rb_entry(node, struct cma_bf_item, ch.by_start);
+	}
+}
+
+
+
+/************************* Register *************************/
+static int cma_bf_module_init(void)
+{
+	static struct cma_allocator alloc = {
+		.name    = "bf",
+		.init    = cma_bf_init,
+		.cleanup = cma_bf_cleanup,
+		.alloc   = cma_bf_alloc,
+		.free    = cma_bf_free,
+	};
+	return cma_allocator_register(&alloc);
+}
+module_init(cma_bf_module_init);
diff --git a/mm/cma.c b/mm/cma.c
new file mode 100644
index 0000000..0b02ab0
--- /dev/null
+++ b/mm/cma.c
@@ -0,0 +1,1170 @@
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#ifndef CONFIG_NO_BOOTMEM
+#  include <linux/bootmem.h>   /* alloc_bootmem_pages_nopanic() */
+#endif
+#ifdef CONFIG_HAVE_MEMBLOCK
+#  include <linux/memblock.h>  /* memblock*() */
+#endif
+#include <linux/device.h>      /* struct device, dev_name() */
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR, PTR_ERR, etc. */
+#include <linux/mm.h>          /* PAGE_ALIGN() */
+#include <linux/module.h>      /* EXPORT_SYMBOL_GPL() */
+#include <linux/mutex.h>       /* mutex */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/string.h>      /* str*() */
+
+#include <linux/cma.h>
+
+
+/*
+ * Protects cma_regions, cma_allocators, cma_attr_map (which means
+ * cma_map and cma_map_len), cma_attr_asterisk (which means
+ * cma_asterisk and cma_asterisk_len) and cma_chunks_by_start.
+ */
+static DEFINE_MUTEX(cma_mutex);
+
+static struct kobject *cma_kobj;
+
+
+/************************* Attributes *************************/
+
+struct cma_attribute {
+	struct kobj_attribute kattr;
+	const char *value;
+	size_t length;
+	ssize_t (*validate)(const char *value);
+};
+
+#define __CMA_ATTR(_name) \
+	__ATTR(_name, 0644, cma_attr_show, cma_attr_store)
+
+static ssize_t cma_attr_show(struct kobject *kobj,
+			     struct kobj_attribute *_attr, char *page)
+{
+	struct cma_attribute *attr =
+		container_of(_attr, struct cma_attribute, kattr);
+	ssize_t len;
+
+	mutex_lock(&cma_mutex);
+
+	if (!attr->value || !attr->length) {
+		*page = 0;
+		len = 0;
+	} else {
+		len = min(attr->length, (size_t)PAGE_SIZE - 2);
+		memcpy(page, attr->value, len);
+		page[len]   = '\n';
+		page[++len] = '\0';
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	return len;
+}
+
+static ssize_t cma_attr_store(struct kobject *kobj,
+			      struct kobj_attribute *_attr,
+			      const char *page, size_t count)
+{
+	struct cma_attribute *attr =
+		container_of(_attr, struct cma_attribute, kattr);
+	ssize_t len = attr->validate(page);
+	char *val = NULL;
+
+	if (len < 0)
+		return len;
+
+	if (len) {
+		val = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!val)
+			return -ENOMEM;
+		val[len] = '\0';
+	}
+
+	mutex_lock(&cma_mutex);
+
+	kfree(attr->value);
+	attr->value = val;
+	attr->length = len;
+
+	mutex_unlock(&cma_mutex);
+
+	return count;
+}
+
+
+/*
+ * param        ::= [ rules [ ';' ] ]
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' regions
+ * patterns     ::= pattern [ ',' patterns ]
+ *
+ * regions      ::= reg-name [ ',' regions ] | '*'
+ *              // list of regions to try to allocate memory
+ *              // from for devices that match pattern
+ *
+ * pattern      ::= dev-pattern [ '/' kind-pattern ]
+ *                | '/' kind-pattern
+ *              // pattern request must match for this rule to
+ *              // apply to it; the first rule that matches is
+ *              // applied; if dev-pattern part is omitted
+ *              // value identical to the one used in previous
+ *              // rule is assumed
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example (white space added for convenience, forbidden in real string):
+ * cma_map = foo-dev = reg1;             -- foo-dev with no kind
+ *           bar-dev / firmware = reg3;  -- bar-dev's firmware
+ *           / * = reg2;                 -- bar-dev's all other kinds
+ *           baz-dev / * = reg1,reg2;    -- any kind of baz-dev
+ *           * / * = reg2,reg1;          -- any other allocations
+ */
+static ssize_t cma_map_validate(const char *param)
+{
+	const char *ch = param;
+
+	if (*ch == '\0' || *ch == '\n')
+		return 0;
+
+	for (;;) {
+		const char *start = ch;
+
+		while (*ch && *ch != '\n' && *ch != ';' && *ch != '=')
+			++ch;
+
+		if (*ch != '=' || start == ch) {
+			pr_err("map: expecting \"<patterns>=<regions>\" near %s\n", start);
+			return -EINVAL;
+		}
+
+		if (ch[1] == '*' && (ch[2] && ch[2] != '\n' && ch[2] != ';')) {
+			pr_err("map: expecting ';' or end of string after '*' near %s\n", start);
+			return -EINVAL;
+		}
+
+		while (*++ch != ';')
+			if (!*ch || *ch == '\n')
+				return ch - param;
+		if (ch[1] == '\0' || ch[1] == '\n')
+			return ch - param;
+		++ch;
+	}
+}
+
+
+static struct cma_attribute cma_attributes[] = {
+	{
+		.kattr		= __CMA_ATTR(map),
+		.validate	= &cma_map_validate
+	},
+	{
+		.kattr		= __CMA_ATTR(asterisk),
+		.validate	= (ssize_t (*)(const char *))&strlen,
+	},
+};
+
+#define cma_map       (cma_attributes[0].value)
+#define cma_asterisk  (cma_attributes[1].value)
+
+
+static int __init cma_attributes_init(void)
+{
+	struct cma_attribute *attr = cma_attributes;
+	int i = ARRAY_SIZE(cma_attributes), ret = 0;
+
+	do {
+		if (attr->value) {
+			/* Apply defaults.  Ignore parsing errors. */
+			const char *v = attr->value;
+			attr->value = NULL;
+			cma_attr_store(NULL, &attr->kattr, v, 0);
+		}
+
+		/*
+		 * Even if there was an error creating one file
+		 * initialise the rest.
+		 */
+		if (cma_kobj && !ret)
+			ret = sysfs_create_file(cma_kobj, &attr->kattr.attr);
+	} while (++attr, --i);
+
+	return ret;
+}
+
+
+
+/************************* Early parameter *************************/
+
+struct list_head cma_early_regions __initdata =
+	LIST_HEAD_INIT(cma_early_regions);
+
+/*
+ * regions-attr ::= [ regions [ ';' ] ]
+ * regions      ::= region [ ';' regions ]
+ *
+ * region       ::= [ '-' ] reg-name
+ *                    '=' size
+ *                  [ '@' start ]
+ *                  [ '/' alignment ]
+ *                  [ ':' alloc-name ]
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example:
+ * cma=reg1=64M:bf;reg2=32M@0x100000:bf;reg3=64M/1M:bf
+ *
+ * If allocator is omitted the first available allocator will be used.
+ */
+
+#define NUMPARSE(cond_ch, type, cond) ({				\
+		unsigned long long v = 0;				\
+		if (*param == (cond_ch)) {				\
+			const char *const msg = param + 1;		\
+			v = memparse(msg, &param);			\
+			if (!v || v > ~(type)0 || !(cond)) {		\
+				pr_err("param: invalid value near %s\n", msg); \
+				ret = -EINVAL;				\
+				break;					\
+			}						\
+		}							\
+		v;							\
+	})
+
+static int __init cma_param_parse(char *param)
+{
+	static int already_parsed __initdata;
+
+	static struct cma_region regions[16];
+	static char buffer[128] __initdata;
+
+	size_t left = ARRAY_SIZE(regions);
+	struct cma_region *reg = regions;
+
+	size_t buf_left = ARRAY_SIZE(buffer);
+	char *buf = buffer;
+
+	int ret = 0;
+
+	pr_debug("param: %s\n", param);
+
+	if (already_parsed)
+		return 0;
+	already_parsed = 1;
+
+	for (; *param; ++reg) {
+		dma_addr_t start, alignment;
+		size_t size;
+		int minus;
+
+		if (unlikely(!--left)) {
+			pr_err("param: too many early regions\n");
+			return -ENOSPC;
+		}
+
+		/* Parse name */
+		minus = *param == '-';
+		if (minus)
+			++param;
+
+		reg->name = param;
+		param = strchr(param, '=');
+		if (!param || param == reg->name) {
+			pr_err("param: expected \"<name>=\" near %s\n",
+			       reg->name);
+			ret = -EINVAL;
+			break;
+		}
+		*param = '\0';
+
+		/* Copy name to buffer */
+		if (!minus) {
+			size_t len = param - reg->name;
+			if (len >= buf_left) {
+				pr_err("param: names too long\n");
+				ret = -ENOSPC;
+				break;
+			}
+
+			memcpy(buf, reg->name, len);
+			buf += len;
+			*buf = ',';
+			++buf;
+			buf_left -= len + 1;
+		}
+
+		/* Parse numbers */
+		size      = NUMPARSE('\0', size_t, true);
+		start     = NUMPARSE('@', dma_addr_t, true);
+		alignment = NUMPARSE('/', dma_addr_t, (v & (v - 1)) == 0);
+
+		alignment = max(alignment, (dma_addr_t)PAGE_SIZE);
+		start     = ALIGN(start, alignment);
+		size      = PAGE_ALIGN(size);
+		if (start + size < start) {
+			pr_err("param: invalid start, size combination\n");
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Parse allocator */
+		if (*param == ':') {
+			reg->alloc_name = ++param;
+			while (*param && *param != ';')
+				++param;
+			if (param == reg->alloc_name)
+				reg->alloc_name = NULL;
+		}
+
+		/* Go to next */
+		if (*param == ';') {
+			*param = '\0';
+			++param;
+		} else if (*param) {
+			pr_err("param: expecting ';' or end of parameter near %s\n",
+			       param);
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Add */
+		reg->size      = size;
+		reg->start     = start;
+		reg->alignment = alignment;
+		reg->copy_name = 1;
+
+		list_add_tail(&reg->list, &cma_early_regions);
+
+		pr_debug("param: registering early region %s (%p@%p/%p)\n",
+			 reg->name, (void *)reg->size, (void *)reg->start,
+			 (void *)reg->alignment);
+	}
+
+	if (buf != buffer) {
+		buf[-1] = '\0';
+		cma_asterisk = buffer;
+	}
+	return ret;
+}
+early_param("cma", cma_param_parse);
+
+#undef NUMPARSE
+
+
+int __init __must_check cma_early_region_register(struct cma_region *reg)
+{
+	dma_addr_t start, alignment;
+	size_t size;
+
+	if (reg->alignment & (reg->alignment - 1))
+		return -EINVAL;
+
+	alignment = max(reg->alignment, (dma_addr_t)PAGE_SIZE);
+	start     = ALIGN(reg->start, alignment);
+	size      = PAGE_ALIGN(reg->size);
+
+	if (start + size < start)
+		return -EINVAL;
+
+	reg->size      = size;
+	reg->start     = start;
+	reg->alignment = alignment;
+
+	list_add_tail(&reg->list, &cma_early_regions);
+
+	pr_debug("param: registering early region %s (%p@%p/%p)\n",
+		 reg->name, (void *)reg->size, (void *)reg->start,
+		 (void *)reg->alignment);
+
+	return 0;
+}
+
+
+
+/************************* Regions & Allocators *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg);
+
+
+/* List of all regions.  Named regions are kept before unnamed. */
+static LIST_HEAD(cma_regions);
+
+#define cma_foreach_region(reg) \
+	list_for_each_entry(reg, &cma_regions, list)
+
+int __must_check cma_region_register(struct cma_region *reg)
+{
+	const char *name, *alloc_name;
+	struct cma_region *r;
+	char *ch = NULL;
+	int ret = 0;
+
+	if (!reg->size || reg->start + reg->size < reg->start)
+		return -EINVAL;
+
+	/* reg->users = 0; */
+	reg->used = 0;
+	reg->private_data = NULL;
+	reg->registered = 0;
+	reg->free_space = reg->size;
+
+	/* Copy name and alloc_name */
+	name = reg->name;
+	alloc_name = reg->alloc_name;
+	if (reg->copy_name && (reg->name || reg->alloc_name)) {
+		size_t name_size, alloc_size;
+
+		name_size  = reg->name       ? strlen(reg->name) + 1       : 0;
+		alloc_size = reg->alloc_name ? strlen(reg->alloc_name) + 1 : 0;
+
+		ch = kmalloc(name_size + alloc_size, GFP_KERNEL);
+		if (!ch) {
+			pr_err("%s: not enough memory to allocate name\n",
+			       reg->name ?: "(private)");
+			return -ENOMEM;
+		}
+
+		if (name_size) {
+			memcpy(ch, reg->name, name_size);
+			name = ch;
+			ch += name_size;
+		}
+
+		if (alloc_size) {
+			memcpy(ch, reg->alloc_name, alloc_size);
+			alloc_name = ch;
+		}
+	}
+
+	mutex_lock(&cma_mutex);
+
+	/* Don't let regions overlap */
+	cma_foreach_region(r)
+		if (r->start + r->size > reg->start &&
+		    r->start < reg->start + reg->size) {
+			ret = -EADDRINUSE;
+			goto done;
+		}
+
+	if (reg->alloc) {
+		ret = __cma_region_attach_alloc(reg);
+		if (unlikely(ret < 0))
+			goto done;
+	}
+
+	reg->name = name;
+	reg->alloc_name = alloc_name;
+	reg->registered = 1;
+	ch = NULL;
+
+	/*
+	 * Keep named at the beginning and unnamed (private) at the
+	 * end.  This helps in traversal when named region is looked
+	 * for.
+	 */
+	if (name)
+		list_add(&reg->list, &cma_regions);
+	else
+		list_add_tail(&reg->list, &cma_regions);
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: region %sregistered\n",
+		 reg->name ?: "(private)", ret ? "not " : "");
+	if (ch)
+		kfree(reg->name ? name : alloc_name);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_region_register);
+
+static struct cma_region *__must_check
+__cma_region_find(const char **namep)
+{
+	struct cma_region *reg;
+	const char *ch, *name;
+	size_t n;
+
+	for (ch = *namep; *ch && *ch != ',' && *ch != ';'; ++ch)
+		/* nop */;
+	name = *namep;
+	*namep = *ch ? ch : (ch + 1);
+	n = ch - name;
+
+	/*
+	 * Named regions are kept in front of unnamed so if we
+	 * encounter unnamed region we can stop.
+	 */
+	cma_foreach_region(reg)
+		if (!reg->name)
+			break;
+		else if (!strncmp(name, reg->name, n) && !reg->name[n])
+			return reg;
+
+	return NULL;
+}
+
+
+/* List of all allocators. */
+static LIST_HEAD(cma_allocators);
+
+#define cma_foreach_allocator(alloc) \
+	list_for_each_entry(alloc, &cma_allocators, list)
+
+int cma_allocator_register(struct cma_allocator *alloc)
+{
+	struct cma_region *reg;
+	int first;
+
+	if (!alloc->alloc || !alloc->free)
+		return -EINVAL;
+
+	/* alloc->users = 0; */
+
+	mutex_lock(&cma_mutex);
+
+	first = list_empty(&cma_allocators);
+
+	list_add_tail(&alloc->list, &cma_allocators);
+
+	/*
+	 * Attach this allocator to all allocator-less regions that
+	 * request this particular allocator (reg->alloc_name equals
+	 * alloc->name) or if region wants the first available
+	 * allocator and we are the first.
+	 */
+	cma_foreach_region(reg) {
+		if (reg->alloc)
+			continue;
+		if (!(reg->alloc_name
+		    ? alloc->name && !strcmp(alloc->name, reg->alloc_name)
+		    : (!reg->used && first)))
+			continue;
+
+		reg->alloc = alloc;
+		__cma_region_attach_alloc(reg);
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: allocator registered\n", alloc->name ?: "(unnamed)");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cma_allocator_register);
+
+static struct cma_allocator *__must_check
+__cma_allocator_find(const char *name)
+{
+	struct cma_allocator *alloc;
+
+	if (!name)
+		return list_empty(&cma_allocators)
+			? NULL
+			: list_entry(cma_allocators.next,
+				     struct cma_allocator, list);
+
+	cma_foreach_allocator(alloc)
+		if (alloc->name && !strcmp(name, alloc->name))
+			return alloc;
+
+	return NULL;
+}
+
+
+
+/************************* Initialise CMA *************************/
+
+int __init cma_set_defaults(char *regions, const char *map,
+                            const char *asterisk)
+{
+	if (map)
+		cma_map = map;
+
+	if (asterisk)
+		cma_asterisk = asterisk;
+
+	if (regions)
+		return cma_param_parse(regions);
+	else
+		return 0;
+}
+
+
+int __init cma_early_region_reserve(struct cma_region *reg)
+{
+	int tried = 0;
+
+	if (!reg->size || (reg->alignment & (reg->alignment - 1)) ||
+	    reg->reserved)
+		return -EINVAL;
+
+#ifndef CONFIG_NO_BOOTMEM
+
+	tried = 1;
+
+	{
+		void *ptr;
+
+		ptr = __alloc_bootmem_nopanic(reg->size, reg->alignment,
+					      reg->start);
+		if (ptr) {
+			reg->start = virt_to_phys(ptr);
+			reg->reserved = 1;
+			return 0;
+		}
+	}
+
+#endif
+
+#ifdef CONFIG_HAVE_MEMBLOCK
+
+	tried = 1;
+
+	if (reg->start) {
+		if (memblock_is_region_reserved(reg->start, reg->size) < 0 &&
+		    memblock_reserve(reg->start, reg->size) >= 0) {
+			reg->reserved = 1;
+			return 0;
+		}
+	} else {
+		/*
+		 * Use __memblock_alloc_base() since
+		 * memblock_alloc_base() panic()s.
+		 */
+		u64 ret = __memblock_alloc_base(reg->size, reg->alignment, 0);
+		if (ret &&
+		    ret < ~(dma_addr_t)0 &&
+		    ret + reg->size < ~(dma_addr_t)0 &&
+		    ret + reg->size > ret) {
+			reg->start = ret;
+			reg->reserved = 1;
+			return 0;
+		}
+
+		if (ret)
+			memblock_free(ret, reg->size);
+	}
+
+#endif
+
+	return tried ? -ENOMEM : -EOPNOTSUPP;
+}
+
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg))
+{
+	struct cma_region *reg;
+
+	pr_debug("init: reserving early regions\n");
+
+	if (!reserve)
+		reserve = cma_early_region_reserve;
+
+	list_for_each_entry(reg, &cma_early_regions, list) {
+		if (reg->reserved) {
+			/* nothing */
+		} else if (reserve(reg) >= 0) {
+			pr_debug("init: %s: reserved %p@%p\n",
+				 reg->name ?: "(private)",
+				 (void *)reg->size, (void *)reg->start);
+			reg->reserved = 1;
+		} else {
+			pr_warn("init: %s: unable to reserve %p@%p/%p\n",
+				reg->name ?: "(private)",
+				(void *)reg->size, (void *)reg->start,
+				(void *)reg->alignment);
+		}
+	}
+}
+
+
+static int __init cma_init(void)
+{
+	struct cma_region *reg, *n;
+
+	pr_debug("init: initialising\n");
+
+	cma_kobj = kobject_create_and_add("contiguous", mm_kobj);
+	if (unlikely(!cma_kobj))
+		pr_err("init: unable to add sysfs directory\n");
+
+	cma_attributes_init();
+
+	list_for_each_entry_safe(reg, n, &cma_early_regions, list) {
+		INIT_LIST_HEAD(&reg->list);
+		/*
+		 * We don't care if there was an error.  It's a pity
+		 * but there's not much we can do about it anyway.
+		 * If the error is on a region that was parsed from the
+		 * command line then it will stay and waste a bit of
+		 * space; if it was registered using
+		 * cma_early_region_register() it is the caller's
+		 * responsibility to do something about it.
+		 */
+		if (reg->reserved && cma_region_register(reg) < 0)
+			/* ignore error */;
+	}
+
+	INIT_LIST_HEAD(&cma_early_regions);
+
+	return 0;
+}
+__initcall(cma_init);
+
+
+
+/************************* Chunks *************************/
+
+/* All chunks sorted by start address. */
+static struct rb_root cma_chunks_by_start;
+
+static struct cma_chunk *__must_check __cma_chunk_find(dma_addr_t addr)
+{
+	struct cma_chunk *chunk;
+	struct rb_node *n;
+
+	for (n = cma_chunks_by_start.rb_node; n; ) {
+		chunk = rb_entry(n, struct cma_chunk, by_start);
+		if (addr < chunk->start)
+			n = n->rb_left;
+		else if (addr > chunk->start)
+			n = n->rb_right;
+		else
+			return chunk;
+	}
+	WARN(1, "no chunk starting at %p\n", (void *)addr);
+	return NULL;
+}
+
+static int __must_check __cma_chunk_insert(struct cma_chunk *chunk)
+{
+	struct rb_node **new, *parent = NULL;
+	typeof(chunk->start) addr = chunk->start;
+
+	for (new = &cma_chunks_by_start.rb_node; *new; ) {
+		struct cma_chunk *c =
+			container_of(*new, struct cma_chunk, by_start);
+
+		parent = *new;
+		if (addr < c->start) {
+			new = &(*new)->rb_left;
+		} else if (addr > c->start) {
+			new = &(*new)->rb_right;
+		} else {
+			/*
+			 * We should never be here.  If we are it
+			 * means allocator gave us an invalid chunk
+			 * (one that has already been allocated) so we
+			 * refuse to accept it.  Our caller will
+			 * recover by freeing the chunk.
+			 */
+			WARN_ON(1);
+			return -EADDRINUSE;
+		}
+	}
+
+	rb_link_node(&chunk->by_start, parent, new);
+	rb_insert_color(&chunk->by_start, &cma_chunks_by_start);
+
+	return 0;
+}
+
+static void __cma_chunk_release(struct kref *ref)
+{
+	struct cma_chunk *chunk = container_of(ref, struct cma_chunk, ref);
+
+	rb_erase(&chunk->by_start, &cma_chunks_by_start);
+
+	chunk->reg->alloc->free(chunk);
+	/* --chunk->reg->users; */
+	chunk->reg->free_space += chunk->size;
+}
+
+
+
+/************************* The Device API *************************/
+
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *kind);
+
+
+/* Allocate. */
+
+static dma_addr_t __must_check
+__cma_alloc_from_region(struct cma_region *reg,
+			size_t size, dma_addr_t alignment)
+{
+	struct cma_chunk *chunk;
+
+	if (!reg || reg->free_space < size)
+		return -ENOMEM;
+
+	if (!reg->alloc) {
+		if (!reg->used)
+			__cma_region_attach_alloc(reg);
+		if (!reg->alloc)
+			return -ENOMEM;
+	}
+
+	chunk = reg->alloc->alloc(reg, size, alignment);
+	if (!chunk)
+		return -ENOMEM;
+
+	if (unlikely(__cma_chunk_insert(chunk) < 0)) {
+		/* We should *never* be here. */
+		chunk->reg->alloc->free(chunk);
+		kfree(chunk);
+		return -EADDRINUSE;
+	}
+
+	chunk->reg = reg;
+	kref_init(&chunk->ref);
+	/* ++reg->users; */
+	reg->free_space -= chunk->size;
+	pr_debug("allocated at %p\n", (void *)chunk->start);
+	return chunk->start;
+}
+
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment)
+{
+	dma_addr_t addr;
+
+	mutex_lock(&cma_mutex);
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!size || alignment & (alignment - 1) || !reg->registered)
+		addr = -EINVAL;
+	else
+		addr = __cma_alloc_from_region(reg, PAGE_ALIGN(size),
+					       max(alignment, (dma_addr_t)PAGE_SIZE));
+
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(cma_alloc_from_region);
+
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *kind,
+	    dma_addr_t size, dma_addr_t alignment)
+{
+	struct cma_region *reg;
+	const char *from;
+	dma_addr_t addr;
+
+	if (dev)
+		pr_debug("allocate %p/%p for %s/%s\n",
+			 (void *)size, (void *)alignment,
+			 dev_name(dev), kind ?: "");
+
+	if (!size || alignment & (alignment - 1))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, kind);
+	if (unlikely(IS_ERR(from))) {
+		addr = PTR_ERR(from);
+		goto done;
+	}
+
+	pr_debug("allocate %p/%p from one of %s\n",
+		 (void *)size, (void *)alignment, from);
+
+	size = PAGE_ALIGN(size);
+	if (alignment < PAGE_SIZE)
+		alignment = PAGE_SIZE;
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		addr = __cma_alloc_from_region(reg, size, alignment);
+		if (!IS_ERR_VALUE(addr))
+			goto done;
+	}
+
+	pr_debug("not enough memory\n");
+	addr = -ENOMEM;
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(__cma_alloc);
+
+
+/* Query information about regions. */
+int
+__cma_info(struct cma_info *infop, const struct device *dev, const char *kind)
+{
+	struct cma_info info = { ~(dma_addr_t)0, 0, 0, 0, 0 };
+	const char *from;
+	int ret;
+
+	if (unlikely(!infop))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, kind);
+	if (IS_ERR(from)) {
+		ret = PTR_ERR(from);
+		info.lower_bound = 0;
+		goto done;
+	}
+
+	while (*from && *from != ';') {
+		struct cma_region *reg = __cma_region_find(&from);
+		if (reg) {
+			info.total_size += reg->size;
+			info.free_size += reg->free_space;
+			if (info.lower_bound > reg->start)
+				info.lower_bound = reg->start;
+			if (info.upper_bound < reg->start + reg->size)
+				info.upper_bound = reg->start + reg->size;
+			++info.count;
+		}
+	}
+
+	ret = 0;
+done:
+	mutex_unlock(&cma_mutex);
+	memcpy(infop, &info, sizeof info);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__cma_info);
+
+
+/* kref handling. */
+
+int cma_get(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+
+	mutex_lock(&cma_mutex);
+	c = __cma_chunk_find(addr);
+	if (c)
+		kref_get(&c->ref);
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("get(%p): %sfound\n", (void *)addr, c ? "" : "not ");
+	return c ? 0 : -ENOENT;
+}
+EXPORT_SYMBOL_GPL(cma_get);
+
+int cma_put(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	c = __cma_chunk_find(addr);
+	ret = c ? kref_put(&c->ref, __cma_chunk_release) : -ENOENT;
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("put(%p): %sfound%s\n", (void *)addr, c ? "" : "not ",
+		 ret > 0 ? " and destroyed" : "");
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_put);
+
+
+/************************* Miscellaneous *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg)
+{
+	struct cma_allocator *alloc;
+	int ret;
+
+	/*
+	 * If reg->alloc is set then caller wants us to use this
+	 * allocator.  Otherwise we need to find one by name.
+	 */
+	if (reg->alloc) {
+		alloc = reg->alloc;
+	} else {
+		alloc = __cma_allocator_find(reg->alloc_name);
+		if (!alloc) {
+			pr_warn("init: %s: %s: no such allocator\n",
+				reg->name ?: "(private)",
+				reg->alloc_name ?: "(default)");
+			reg->used = 1;
+			return -ENOENT;
+		}
+	}
+
+	/* Try to initialise the allocator. */
+	reg->private_data = NULL;
+	ret = alloc->init ? alloc->init(reg) : 0;
+	if (unlikely(ret < 0)) {
+		pr_err("init: %s: %s: unable to initialise allocator\n",
+		       reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+		reg->alloc = NULL;
+		reg->used = 1;
+	} else {
+		reg->alloc = alloc;
+		/* ++alloc->users; */
+		pr_debug("init: %s: %s: initialised allocator\n",
+			 reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+	}
+	return ret;
+}
+
+
+/*
+ * s            ::= rules
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' [ regions ]
+ * patterns     ::= pattern [ ',' patterns ]
+ * pattern      ::= dev-pattern [ '/' kind-pattern ]
+ *                | '/' kind-pattern
+ */
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *kind)
+{
+	/*
+	 * This function matches the pattern from the map attribute
+	 * against the given device name and kind.  Kind may of course
+	 * be NULL or an empty string.
+	 */
+
+	const char *s, *name;
+	int name_matched = 0;
+
+	/*
+	 * If dev is NULL we were called in alternative form where
+	 * kind is the from string.  All we have to do is return it
+	 * unless it's NULL or "*" in which case we return
+	 * cma_asterisk.
+	 */
+	if (!dev) {
+		if (!kind || *kind == '*')
+			return cma_asterisk ?: ERR_PTR(-ENOENT);
+		else
+			return kind;
+	}
+
+	if (!cma_map)
+		return cma_asterisk ?: ERR_PTR(-ENOENT);
+
+	name = dev_name(dev);
+	if (WARN_ON(!name || !*name))
+		return ERR_PTR(-EINVAL);
+
+	if (!kind)
+		kind = "";
+
+	/*
+	 * Now we go through the map attribute.  It is set either by
+	 * cma_set_defaults() or through the SysFS entry.
+	 */
+	for (s = cma_map; *s; ++s) {
+		const char *c;
+
+		/*
+		 * If the pattern starts with a slash, the device part of the
+		 * pattern matches if it matched previously.
+		 */
+		if (*s == '/') {
+			if (!name_matched)
+				goto look_for_next;
+			goto match_kind;
+		}
+
+		/*
+		 * We are now trying to match the device name.  This also
+		 * updates the name_matched variable.  If, while reading the
+		 * spec, we encounter a comma it means that the pattern does not
+		 * match and we need to start over with another pattern (the
+		 * one after the comma).  If we encounter an equal sign we need
+		 * to start over with another rule.  If there is a character
+		 * that does not match, we need to look for a comma (to get
+		 * another pattern) or semicolon (to get another rule) and try
+		 * again if there is one somewhere.
+		 */
+
+		name_matched = 0;
+
+		for (c = name; *s != '*' && *c; ++c, ++s)
+			if (*s == '=')
+				goto next_rule;
+			else if (*s == ',')
+				continue;
+			else if (*s != '?' && *c != *s)
+				goto look_for_next;
+		if (*s == '*')
+			++s;
+
+		name_matched = 1;
+
+		/*
+		 * Now we need to match the kind part of the pattern.  If the
+		 * pattern has none, we match only if kind points to an empty
+		 * string.  Otherwise we try to match it just like the name.
+		 */
+		if (*s != '/') {
+			if (*kind)
+				goto look_for_next;
+		} else {
+match_kind:		/* s points to '/' */
+			++s;
+
+			for (c = kind; *s != '*' && *c; ++c, ++s)
+				if (*s == '=')
+					goto next_rule;
+				else if (*s == ',')
+					continue;
+				else if (*s != '?' && *c != *s)
+					goto look_for_next;
+			if (*s == '*')
+				++s;
+		}
+
+		/* Return the string behind the '=' sign of the rule. */
+		if (*s == '=' || *s == ',') {
+			s = strchr(s, '=') + 1;
+			return *s == '*' ? cma_asterisk ?: ERR_PTR(-ENOENT) : s;
+		}
+
+look_for_next:
+		do {
+			++s;
+		} while (*s != ',' && *s != '=');
+		if (*s == ',')
+			continue;
+
+next_rule:	/* s points to '=' */
+		s = strchr(s, ';');
+		if (!s)
+			break;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
-- 
1.7.1
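
For readers following the grammar in the comment above
__cma_where_from(), the lookup can be sketched in plain user-space C.
This is a simplified illustration and not the code from the patch:
glob_match() and map_lookup() are hypothetical names, the sketch does
not substitute cma_asterisk for a "*" region list, and it assumes that
'*' only appears as the last character of a pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Match str against pat[0..len).  '?' matches any single character and
 * '*' matches the rest of str (so it is only useful as the last character).
 */
static int glob_match(const char *pat, size_t len, const char *str)
{
	size_t i;

	for (i = 0; i < len; ++i) {
		if (pat[i] == '*')
			return 1;
		if (!str[i] || (pat[i] != '?' && pat[i] != str[i]))
			return 0;
	}
	return str[len] == '\0';
}

/*
 * Return the text after '=' of the first rule with a pattern matching
 * dev and kind, or NULL.  The result points into map; the caller should
 * stop reading at the next ';' or at the end of the string.
 */
static const char *map_lookup(const char *map, const char *dev,
			      const char *kind)
{
	int dev_matched = 0;

	assert(map && dev && kind);

	while (*map) {
		const char *eq = strchr(map, '=');
		const char *next, *pat;

		if (!eq)
			break;

		for (pat = map; pat < eq; ) {
			const char *pat_end = memchr(pat, ',', (size_t)(eq - pat));
			const char *slash;

			if (!pat_end)
				pat_end = eq;
			slash = memchr(pat, '/', (size_t)(pat_end - pat));

			/* A pattern starting with '/' reuses the previous device match. */
			if (slash != pat)
				dev_matched = glob_match(pat,
					(size_t)((slash ? slash : pat_end) - pat), dev);

			/* Without a kind pattern only an empty kind matches. */
			if (dev_matched &&
			    (slash ? glob_match(slash + 1,
						(size_t)(pat_end - slash - 1), kind)
				   : !*kind))
				return eq + 1;

			pat = pat_end < eq ? pat_end + 1 : eq;
		}

		next = strchr(eq, ';');
		if (!next)
			break;
		map = next + 1;
	}
	return NULL;
}
```

With the map "mfc/fw=fw_region;mfc,fimc*/*=mm;*/*=common", a call such
as map_lookup(map, "fimc0", "a") returns a pointer to the text starting
at "mm", because "fimc*" matches the device and "*" matches the kind.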

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCHv2 3/4] mm: cma: Test device and application added
  2010-07-26 14:11     ` Michal Nazarewicz
  (?)
@ 2010-07-26 14:11       ` Michal Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

This patch adds a "cma" misc device which lets user space use the
CMA API.  This device is meant for testing.  A testing application
is also provided.
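
A minimal user-space sketch of the intended flow, assuming the device
node shows up as /dev/cma.  The request layout and ioctl number are
copied from the <linux/cma.h> hunk of this patch so that the example
builds without the patched headers; cma_request_fill() and
cma_chunk_open() are hypothetical helpers that are not part of the
patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Local copy of the definitions added to <linux/cma.h> by this patch. */
#define CMA_MAGIC (('c' << 24) | ('M' << 16) | ('a' << 8) | 0x42)

struct cma_alloc_request {
	uint32_t magic;
	char     name[17];
	char     kind[17];
	uint16_t pad;
	uint64_t size;
	uint64_t alignment;
	uint64_t start;
};

#define IOCTL_CMA_ALLOC _IOWR('p', 0, struct cma_alloc_request)

/* Fill req; returns 0, or -1 if a string does not fit with its NUL. */
static int cma_request_fill(struct cma_alloc_request *req, const char *name,
			    const char *kind, uint64_t size,
			    uint64_t alignment)
{
	assert(req && name);

	if (!kind)
		kind = "";
	if (strlen(name) >= sizeof req->name ||
	    strlen(kind) >= sizeof req->kind)
		return -1;

	memset(req, 0, sizeof *req);
	req->magic     = CMA_MAGIC;
	req->size      = size;
	req->alignment = alignment;
	strcpy(req->name, name);
	strcpy(req->kind, kind);
	return 0;
}

/*
 * Open the device and allocate one chunk.  Returns the descriptor that
 * owns the chunk, or -1 with errno set by open() or ioctl().
 */
static int cma_chunk_open(struct cma_alloc_request *req)
{
	int fd = open("/dev/cma", O_RDWR);

	if (fd < 0)
		return -1;
	if (ioctl(fd, IOCTL_CMA_ALLOC, req) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

After a successful call req.start holds the physical address written
back by the driver.  Each descriptor can hold one chunk, and closing
the descriptor releases it, since cma_file_release() calls cma_put().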

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/misc/Kconfig   |    8 +
 drivers/misc/Makefile  |    1 +
 drivers/misc/cma-dev.c |  184 ++++++++++++++++++++++++
 include/linux/cma.h    |   30 ++++
 tools/cma/cma-test.c   |  373 ++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 596 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 tools/cma/cma-test.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 9b089df..6ae3d9f 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -368,4 +368,12 @@ source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
 source "drivers/misc/iwmc3200top/Kconfig"
 
+config CMA_DEVICE
+	tristate "CMA misc device (DEVELOPMENT)"
+	depends on CMA
+	help
+	  The CMA misc device allows allocating contiguous memory areas
+	  from user space.  This is mostly for testing of the CMA
+	  framework.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 67552d6..9921370 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -32,3 +32,4 @@ obj-y				+= eeprom/
 obj-y				+= cb710/
 obj-$(CONFIG_VMWARE_BALLOON)	+= vmware_balloon.o
 obj-$(CONFIG_ARM_CHARLCD)	+= arm-charlcd.o
+obj-$(CONFIG_CMA_DEVICE)	+= cma-dev.o
diff --git a/drivers/misc/cma-dev.c b/drivers/misc/cma-dev.c
new file mode 100644
index 0000000..7d7bc05
--- /dev/null
+++ b/drivers/misc/cma-dev.c
@@ -0,0 +1,184 @@
+/*
+ * Contiguous Memory Allocator userspace driver
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR_VALUE() */
+#include <linux/fs.h>          /* struct file */
+#include <linux/mm.h>          /* Memory stuff */
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/module.h>      /* Standard module stuff */
+#include <linux/device.h>      /* struct device, dev_dbg() */
+#include <linux/types.h>       /* Just to be safe ;) */
+#include <linux/uaccess.h>     /* __copy_{to,from}_user */
+#include <linux/miscdevice.h>  /* misc_register() and company */
+
+#include <linux/cma.h>
+
+static int  cma_file_open(struct inode *inode, struct file *file);
+static int  cma_file_release(struct inode *inode, struct file *file);
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg);
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma);
+
+
+static struct miscdevice cma_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name  = "cma",
+	.fops  = &(const struct file_operations) {
+		.owner          = THIS_MODULE,
+		.open           = cma_file_open,
+		.release        = cma_file_release,
+		.unlocked_ioctl = cma_file_ioctl,
+		.mmap           = cma_file_mmap,
+	},
+};
+#define cma_dev (cma_miscdev.this_device)
+
+
+#define cma_file_start(file) (((dma_addr_t *)(file)->private_data)[0])
+#define cma_file_size(file)  (((dma_addr_t *)(file)->private_data)[1])
+
+
+static int  cma_file_open(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	file->private_data = NULL;
+
+	return 0;
+}
+
+
+static int  cma_file_release(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (file->private_data) {
+		cma_put(cma_file_start(file));
+		kfree(file->private_data);
+	}
+
+	return 0;
+}
+
+
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg)
+{
+	struct cma_alloc_request req;
+	struct device fake_device;
+	unsigned long addr;
+	long ret;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (cmd != IOCTL_CMA_ALLOC)
+		return -ENOTTY;
+
+	if (!arg)
+		return -EINVAL;
+
+	if (file->private_data) /* Already allocated */
+		return -EBADFD;
+
+	if (copy_from_user(&req, (void __user *)arg, sizeof req))
+		return -EFAULT;
+
+	if (req.magic != CMA_MAGIC)
+		return -ENOTTY;
+
+	/* The 64-bit fields may not fit in an address on a 32 bit system. */
+	if (req.size > ~(typeof(addr))0 ||
+	    req.alignment > ~(typeof(addr))0)
+		return -EINVAL;
+
+	if (strnlen(req.name, sizeof req.name) >= sizeof req.name
+	 || strnlen(req.kind, sizeof req.kind) >= sizeof req.kind)
+		return -EINVAL;
+
+	file->private_data = kmalloc(2 * sizeof(dma_addr_t), GFP_KERNEL);
+	if (!file->private_data)
+		return -ENOMEM;
+
+	fake_device.init_name = req.name;
+	fake_device.kobj.name = req.name;
+	addr = cma_alloc(&fake_device, req.kind, req.size, req.alignment);
+	if (IS_ERR_VALUE(addr)) {
+		ret = addr;
+		goto error_priv;
+	}
+
+	if (put_user(addr, (typeof(req.start) *)(arg + offsetof(typeof(req), start)))) {
+		ret = -EFAULT;
+		goto error_put;
+	}
+
+	cma_file_start(file) = addr;
+	cma_file_size(file) = req.size;
+
+	dev_dbg(cma_dev, "allocated %p@%p\n",
+		(void *)(dma_addr_t)req.size, (void *)addr);
+
+	return 0;
+
+error_put:
+	cma_put(addr);
+error_priv:
+	kfree(file->private_data);
+	file->private_data = NULL;
+	return ret;
+}
+
+
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	unsigned long pgoff, offset, length;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (!file->private_data)
+		return -EBADFD;
+
+	pgoff  = vma->vm_pgoff;
+	offset = pgoff << PAGE_SHIFT;
+	length = vma->vm_end - vma->vm_start;
+
+	if (offset          >= cma_file_size(file)
+	 || length          >  cma_file_size(file)
+	 || offset + length >  cma_file_size(file))
+		return -ENOSPC;
+
+	return remap_pfn_range(vma, vma->vm_start,
+			       __phys_to_pfn(cma_file_start(file) + offset),
+			       length, vma->vm_page_prot);
+}
+
+
+
+static int __init cma_dev_init(void)
+{
+	int ret = misc_register(&cma_miscdev);
+	pr_debug("miscdev: register returned: %d\n", ret);
+	return ret;
+}
+module_init(cma_dev_init);
+
+static void __exit cma_dev_exit(void)
+{
+	dev_dbg(cma_dev, "deregistering\n");
+	misc_deregister(&cma_miscdev);
+}
+module_exit(cma_dev_exit);
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 96a8652..36e73fb 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -11,6 +11,36 @@
  * See Documentation/contiguous-memory.txt for details.
  */
 
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+
+#define CMA_MAGIC (('c' << 24) | ('M' << 16) | ('a' << 8) | 0x42)
+
+/**
+ * struct cma_alloc_request - area information exportable to user space.
+ * @magic: must always be CMA_MAGIC.
+ * @name:  name of the device to allocate as.
+ * @kind:  kind of the memory.
+ * @pad:   reserved.
+ * @size:  size of the chunk to allocate.
+ * @alignment: desired alignment of the chunk (must be power of two or zero).
+ * @start: when ioctl() finishes this stores physical address of the chunk.
+ */
+struct cma_alloc_request {
+	__u32 magic;
+	char  name[17];
+	char  kind[17];
+	__u16 pad;
+	/* __u64 to be compatible across 32 and 64 bit systems. */
+	__u64 size;
+	__u64 alignment;
+	__u64 start;
+};
+
+#define IOCTL_CMA_ALLOC    _IOWR('p', 0, struct cma_alloc_request)
+
+
 /***************************** Kernel lever API *****************************/
 
 #ifdef __KERNEL__
diff --git a/tools/cma/cma-test.c b/tools/cma/cma-test.c
new file mode 100644
index 0000000..567c57b
--- /dev/null
+++ b/tools/cma/cma-test.c
@@ -0,0 +1,373 @@
+/*
+ * cma-test.c -- CMA testing application
+ *
+ * Copyright (C) 2010 Samsung Electronics
+ *                    Author: Michal Nazarewicz <m.nazarewicz@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/* $(CROSS_COMPILE)gcc -Wall -Wextra -g -o cma-test cma-test.c  */
+
+#include <linux/cma.h>
+
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+
+static void handle_command(char *line);
+
+int main(void)
+{
+	unsigned no = 1;
+	char line[1024];
+	int skip = 0;
+
+	fputs("commands:\n"
+	      " l or list                                list allocated chunks\n"
+	      " a or alloc  <name> <size>[/<alignment>]  allocate chunk\n"
+	      " f or free   [<num>]                      free a chunk\n"
+	      " # ...                                    comment\n"
+	      " <empty line>                             repeat previous\n"
+	      "\n", stderr);
+
+	while (fgets(line, sizeof line, stdin)) {
+		char *nl = strchr(line, '\n');
+		if (nl) {
+			if (skip) {
+				fprintf(stderr, "cma: %u: line too long\n", no);
+				skip = 0;
+			} else {
+				*nl = '\0';
+				handle_command(line);
+			}
+			++no;
+		} else {
+			skip = 1;
+		}
+	}
+
+	if (skip)
+		fprintf(stderr, "cma: %u: no newline at EOF\n", no);
+	return 0;
+}
+
+
+
+static void cmd_list(char *name, char *line);
+static void cmd_alloc(char *name, char *line);
+static void cmd_free(char *name, char *line);
+
+static const struct command {
+	const char name[8];
+	void (*handle)(char *name, char *line);
+} commands[] = {
+	{ "list",  cmd_list  },
+	{ "l",     cmd_list  },
+	{ "alloc", cmd_alloc },
+	{ "a",     cmd_alloc },
+	{ "free",  cmd_free  },
+	{ "f",     cmd_free  },
+	{ "",      NULL      }
+};
+
+
+#define SKIP_SPACE(ch) do while (isspace(*(ch))) ++(ch); while (0)
+
+
+static void handle_command(char *line)
+{
+	static char last_line[1024];
+
+	const struct command *cmd;
+	char *name;
+
+	SKIP_SPACE(line);
+	if (*line == '#')
+		return;
+
+	if (!*line)
+		strcpy(line, last_line);
+	else
+		strcpy(last_line, line);
+
+	name = line;
+	while (*line && !isspace(*line))
+		++line;
+
+	if (*line) {
+		*line = '\0';
+		++line;
+	}
+
+	for (cmd = commands; *(cmd->name); ++cmd)
+		if (!strcmp(name, cmd->name)) {
+			cmd->handle(name, line);
+			return;
+		}
+
+	fprintf(stderr, "%s: unknown command\n", name);
+}
+
+
+
+struct chunk {
+	struct chunk *next, *prev;
+	int fd;
+	unsigned long size;
+	unsigned long start;
+};
+
+static struct chunk root = {
+	.next = &root,
+	.prev = &root,
+};
+
+#define for_each(a) for (a = root.next; a != &root; a = a->next)
+
+static struct chunk *chunk_create(const char *prefix);
+static void chunk_destroy(struct chunk *chunk);
+static void chunk_add(struct chunk *chunk);
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret);
+
+
+static void cmd_list(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	(void)name; (void)line;
+
+	for_each(chunk)
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+}
+
+
+static void cmd_alloc(char *name, char *line)
+{
+	unsigned long size, alignment = 0;
+	struct cma_alloc_request req;
+	char *dev, *kind = NULL;
+	struct chunk *chunk;
+	int ret;
+
+	SKIP_SPACE(line);
+	if (!*line) {
+		fprintf(stderr, "%s: expecting name\n", name);
+		return;
+	}
+
+	for (dev = line; *line && !isspace(*line); ++line)
+		if (*line == '/')
+			kind = line;
+
+	if (!*line) {
+		fprintf(stderr, "%s: expecting size after name\n", name);
+		return;
+	}
+
+	if (kind)
+		*kind++ = '\0';
+	*line++ = '\0';
+
+	if (( kind && (size_t)(kind - dev ) > sizeof req.name)
+	 || (!kind && (size_t)(line - dev ) > sizeof req.name)
+	 || ( kind && (size_t)(line - kind) > sizeof req.kind)) {
+		fprintf(stderr, "%s: name or kind too long\n", name);
+		return;
+	}
+
+
+	if (memparse(line, &line, &size) < 0 || !size) {
+		fprintf(stderr, "%s: invalid size\n", name);
+		return;
+	}
+
+	if (*line == '/')
+		if (memparse(line + 1, &line, &alignment) < 0) {
+			fprintf(stderr, "%s: invalid alignment\n", name);
+			return;
+		}
+
+	SKIP_SPACE(line);
+	if (*line) {
+		fprintf(stderr, "%s: unknown arguments at the end: %s\n",
+			name, line);
+		return;
+	}
+
+
+	chunk = chunk_create(name);
+	if (!chunk)
+		return;
+
+	fprintf(stderr, "%s: allocating %p/%p\n", name,
+		(void *)size, (void *)alignment);
+
+	req.magic     = CMA_MAGIC;
+	req.size      = size;
+	req.alignment = alignment;
+
+	strcpy(req.name, dev);
+	if (kind)
+		strcpy(req.kind, kind);
+	else
+		req.kind[0] = '\0';
+
+
+	ret = ioctl(chunk->fd, IOCTL_CMA_ALLOC, &req);
+	if (ret < 0) {
+		fprintf(stderr, "%s: cma_alloc: %s\n", name, strerror(errno));
+		chunk_destroy(chunk);
+	} else {
+		chunk_add(chunk);
+		chunk->size  = req.size;
+		chunk->start = req.start;
+
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+	}
+}
+
+
+static void cmd_free(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	SKIP_SPACE(line);
+
+	if (*line) {
+		unsigned long num;
+
+		errno = 0;
+		num = strtoul(line, &line, 10);
+
+		if (errno || num > INT_MAX) {
+			fprintf(stderr, "%s: invalid number\n", name);
+			return;
+		}
+
+		SKIP_SPACE(line);
+		if (*line) {
+			fprintf(stderr, "%s: unknown arguments at the end: %s\n",
+				name, line);
+			return;
+		}
+
+		for_each(chunk)
+			if (chunk->fd == (int)num)
+				goto ok;
+		fprintf(stderr, "%s: no chunk %3lu\n", name, num);
+		return;
+
+	} else {
+		chunk = root.prev;
+		if (chunk == &root) {
+			fprintf(stderr, "%s: no chunks\n", name);
+			return;
+		}
+	}
+
+ok:
+	fprintf(stderr, "%s: freeing %p@%p\n", name,
+		(void *)chunk->size, (void *)chunk->start);
+	chunk_destroy(chunk);
+}
+
+
+static struct chunk *chunk_create(const char *prefix)
+{
+	struct chunk *chunk;
+	int fd;
+
+	chunk = malloc(sizeof *chunk);
+	if (!chunk) {
+		fprintf(stderr, "%s: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	fd = open("/dev/cma", O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "%s: /dev/cma: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	chunk->prev = chunk;
+	chunk->next = chunk;
+	chunk->fd   = fd;
+	return chunk;
+}
+
+static void chunk_destroy(struct chunk *chunk)
+{
+	chunk->prev->next = chunk->next;
+	chunk->next->prev = chunk->prev;
+	close(chunk->fd);
+}
+
+static void chunk_add(struct chunk *chunk)
+{
+	chunk->next = &root;
+	chunk->prev = root.prev;
+	root.prev->next = chunk;
+	root.prev = chunk;
+}
+
+
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret)
+{
+	unsigned long val;
+
+	SKIP_SPACE(ptr);
+
+	errno = 0;
+	val = strtoul(ptr, &ptr, 0);
+	if (errno)
+		return -1;
+
+	switch (*ptr) {
+	case 'G':
+	case 'g':
+		val <<= 10;	/* fall through */
+	case 'M':
+	case 'm':
+		val <<= 10;	/* fall through */
+	case 'K':
+	case 'k':
+		val <<= 10;
+		++ptr;
+	}
+
+	if (retptr) {
+		SKIP_SPACE(ptr);
+		*retptr = ptr;
+	}
+
+	*ret = val;
+	return 0;
+}
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCHv2 3/4] mm: cma: Test device and application added
@ 2010-07-26 14:11       ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

This patch adds a "cma" misc device which lets user space use the
CMA API.  This device is meant for testing.  A testing application
is also provided.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/misc/Kconfig   |    8 +
 drivers/misc/Makefile  |    1 +
 drivers/misc/cma-dev.c |  184 ++++++++++++++++++++++++
 include/linux/cma.h    |   30 ++++
 tools/cma/cma-test.c   |  373 ++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 596 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 tools/cma/cma-test.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 9b089df..6ae3d9f 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -368,4 +368,12 @@ source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
 source "drivers/misc/iwmc3200top/Kconfig"
 
+config CMA_DEVICE
+	tristate "CMA misc device (DEVELOPEMENT)"
+	depends on CMA
+	help
+	  The CMA misc device allows allocating contiguous memory areas
+	  from user space.  This is mostly for testing of the CMA
+	  framework.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 67552d6..9921370 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -32,3 +32,4 @@ obj-y				+= eeprom/
 obj-y				+= cb710/
 obj-$(CONFIG_VMWARE_BALLOON)	+= vmware_balloon.o
 obj-$(CONFIG_ARM_CHARLCD)	+= arm-charlcd.o
+obj-$(CONFIG_CMA_DEVICE)	+= cma-dev.o
diff --git a/drivers/misc/cma-dev.c b/drivers/misc/cma-dev.c
new file mode 100644
index 0000000..7d7bc05
--- /dev/null
+++ b/drivers/misc/cma-dev.c
@@ -0,0 +1,184 @@
+/*
+ * Contiguous Memory Allocator userspace driver
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your optional) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR_VALUE() */
+#include <linux/fs.h>          /* struct file */
+#include <linux/mm.h>          /* Memory stuff */
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/module.h>      /* Standard module stuff */
+#include <linux/device.h>      /* struct device, dev_dbg() */
+#include <linux/types.h>       /* Just to be safe ;) */
+#include <linux/uaccess.h>     /* __copy_{to,from}_user */
+#include <linux/miscdevice.h>  /* misc_register() and company */
+
+#include <linux/cma.h>
+
+static int  cma_file_open(struct inode *inode, struct file *file);
+static int  cma_file_release(struct inode *inode, struct file *file);
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg);
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma);
+
+
+static struct miscdevice cma_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name  = "cma",
+	.fops  = &(const struct file_operations) {
+		.owner          = THIS_MODULE,
+		.open           = cma_file_open,
+		.release        = cma_file_release,
+		.unlocked_ioctl = cma_file_ioctl,
+		.mmap           = cma_file_mmap,
+	},
+};
+#define cma_dev (cma_miscdev.this_device)
+
+
+#define cma_file_start(file) (((dma_addr_t *)(file)->private_data)[0])
+#define cma_file_size(file)  (((dma_addr_t *)(file)->private_data)[1])
+
+
+static int  cma_file_open(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	file->private_data = NULL;
+
+	return 0;
+}
+
+
+static int  cma_file_release(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (file->private_data) {
+		cma_put(cma_file_start(file));
+		kfree(file->private_data);
+	}
+
+	return 0;
+}
+
+
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg)
+{
+	struct cma_alloc_request req;
+	struct device fake_device;
+	unsigned long addr;
+	long ret;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (cmd != IOCTL_CMA_ALLOC)
+		return -ENOTTY;
+
+	if (!arg)
+		return -EINVAL;
+
+	if (file->private_data) /* Already allocated */
+		return -EBADFD;
+
+	if (copy_from_user(&req, (void *)arg, sizeof req))
+		return -EFAULT;
+
+	if (req.magic != CMA_MAGIC)
+		return -ENOTTY;
+
+	/* May happen on 32 bit system. */
+	if (req.size > ~(typeof(req.size))0 ||
+	    req.alignment > ~(typeof(req.alignment))0)
+		return -EINVAL;
+
+	if (strnlen(req.name, sizeof req.name) >= sizeof req.name
+	 || strnlen(req.kind, sizeof req.kind) >= sizeof req.kind)
+		return -EINVAL;
+
+	file->private_data = kmalloc(2 * sizeof(dma_addr_t), GFP_KERNEL);
+	if (!file->private_data)
+		return -ENOMEM;
+
+	fake_device.init_name = req.name;
+	fake_device.kobj.name = req.name;
+	addr = cma_alloc(&fake_device, req.kind, req.size, req.alignment);
+	if (IS_ERR_VALUE(addr)) {
+		ret = addr;
+		goto error_priv;
+	}
+
+	if (put_user(addr, (typeof(req.start) *)(arg + offsetof(typeof(req), start)))) {
+		ret = -EFAULT;
+		goto error_put;
+	}
+
+	cma_file_start(file) = addr;
+	cma_file_size(file) = req.size;
+
+	dev_dbg(cma_dev, "allocated %p@%p\n",
+		(void *)(dma_addr_t)req.size, (void *)addr);
+
+	return 0;
+
+error_put:
+	cma_put(addr);
+error_priv:
+	kfree(file->private_data);
+	file->private_data = NULL;
+	return ret;
+}
+
+
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	unsigned long pgoff, offset, length;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (!file->private_data)
+		return -EBADFD;
+
+	pgoff  = vma->vm_pgoff;
+	offset = pgoff << PAGE_SHIFT;
+	length = vma->vm_end - vma->vm_start;
+
+	if (offset          >= cma_file_size(file)
+	 || length          >  cma_file_size(file)
+	 || offset + length >  cma_file_size(file))
+		return -ENOSPC;
+
+	return remap_pfn_range(vma, vma->vm_start,
+			       __phys_to_pfn(cma_file_start(file) + offset),
+			       length, vma->vm_page_prot);
+}
+
+
+
+static int __init cma_dev_init(void)
+{
+	int ret = misc_register(&cma_miscdev);
+	pr_debug("miscdev: register returned: %d\n", ret);
+	return ret;
+}
+module_init(cma_dev_init);
+
+static void __exit cma_dev_exit(void)
+{
+	dev_dbg(cma_dev, "deregisterring\n");
+	misc_deregister(&cma_miscdev);
+}
+module_exit(cma_dev_exit);
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 96a8652..36e73fb 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -11,6 +11,36 @@
  * See Documentation/contiguous-memory.txt for details.
  */
 
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+
+#define CMA_MAGIC (('c' << 24) | ('M' << 16) | ('a' << 8) | 0x42)
+
+/**
+ * An information about area exportable to user space.
+ * @magic: must always be CMA_MAGIC.
+ * @name:  name of the device to allocate as.
+ * @kind:  kind of the memory.
+ * @_pad:  reserved.
+ * @size:  size of the chunk to allocate.
+ * @alignment: desired alignment of the chunk (must be power of two or zero).
+ * @start: when ioctl() finishes this stores physical address of the chunk.
+ */
+struct cma_alloc_request {
+	__u32 magic;
+	char  name[17];
+	char  kind[17];
+	__u16 pad;
+	/* __u64 to be compatible accross 32 and 64 bit systems. */
+	__u64 size;
+	__u64 alignment;
+	__u64 start;
+};
+
+#define IOCTL_CMA_ALLOC    _IOWR('p', 0, struct cma_alloc_request)
+
+
 /***************************** Kernel lever API *****************************/
 
 #ifdef __KERNEL__
diff --git a/tools/cma/cma-test.c b/tools/cma/cma-test.c
new file mode 100644
index 0000000..567c57b
--- /dev/null
+++ b/tools/cma/cma-test.c
@@ -0,0 +1,373 @@
+/*
+ * cma-test.c -- CMA testing application
+ *
+ * Copyright (C) 2010 Samsung Electronics
+ *                    Author: Michal Nazarewicz <m.nazarewicz@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/* $(CROSS_COMPILE)gcc -Wall -Wextra -g -o cma-test cma-test.c  */
+
+#include <linux/cma.h>
+
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+
+static void handle_command(char *line);
+
+int main(void)
+{
+	unsigned no = 1;
+	char line[1024];
+	int skip = 0;
+
+	fputs("commands:\n"
+	      " l or list                                list allocated chunks\n"
+	      " a or alloc  <name> <size>[/<alignment>]  allocate chunk\n"
+	      " f or free   [<num>]                      free an chunk\n"
+	      " # ...                                    comment\n"
+	      " <empty line>                             repeat previous\n"
+	      "\n", stderr);
+
+	while (fgets(line, sizeof line, stdin)) {
+		char *nl = strchr(line, '\n');
+		if (nl) {
+			if (skip) {
+				fprintf(stderr, "cma: %d: line too long\n", no);
+				skip = 0;
+			} else {
+				*nl = '\0';
+				handle_command(line);
+			}
+			++no;
+		} else {
+			skip = 1;
+		}
+	}
+
+	if (skip)
+		fprintf(stderr, "cma: %d: no new line at EOF\n", no);
+	return 0;
+}
+
+
+
+static void cmd_list(char *name, char *line);
+static void cmd_alloc(char *name, char *line);
+static void cmd_free(char *name, char *line);
+
+static const struct command {
+	const char name[8];
+	void (*handle)(char *name, char *line);
+} commands[] = {
+	{ "list",  cmd_list  },
+	{ "l",     cmd_list  },
+	{ "alloc", cmd_alloc },
+	{ "a",     cmd_alloc },
+	{ "free",  cmd_free  },
+	{ "f",     cmd_free  },
+	{ "",      NULL      }
+};
+
+
+#define SKIP_SPACE(ch) do while (isspace(*(ch))) ++(ch); while (0)
+
+
+static void handle_command(char *line)
+{
+	static char last_line[1024];
+
+	const struct command *cmd;
+	char *name;
+
+	SKIP_SPACE(line);
+	if (*line == '#')
+		return;
+
+	if (!*line)
+		strcpy(line, last_line);
+	else
+		strcpy(last_line, line);
+
+	name = line;
+	while (*line && !isspace(*line))
+		++line;
+
+	if (*line) {
+		*line = '\0';
+		++line;
+	}
+
+	for (cmd = commands; *(cmd->name); ++cmd)
+		if (!strcmp(name, cmd->name)) {
+			cmd->handle(name, line);
+			return;
+		}
+
+	fprintf(stderr, "%s: unknown command\n", name);
+}
+
+
+
+struct chunk {
+	struct chunk *next, *prev;
+	int fd;
+	unsigned long size;
+	unsigned long start;
+};
+
+static struct chunk root = {
+	.next = &root,
+	.prev = &root,
+};
+
+#define for_each(a) for (a = root.next; a != &root; a = a->next)
+
+static struct chunk *chunk_create(const char *prefix);
+static void chunk_destroy(struct chunk *chunk);
+static void chunk_add(struct chunk *chunk);
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret);
+
+
+static void cmd_list(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	(void)name; (void)line;
+
+	for_each(chunk)
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+}
+
+
+static void cmd_alloc(char *name, char *line)
+{
+	unsigned long size, alignment = 0;
+	struct cma_alloc_request req;
+	char *dev, *kind = NULL;
+	struct chunk *chunk;
+	int ret;
+
+	SKIP_SPACE(line);
+	if (!*line) {
+		fprintf(stderr, "%s: expecting name\n", name);
+		return;
+	}
+
+	for (dev = line; *line && !isspace(*line); ++line)
+		if (*line == '/')
+			kind = line;
+
+	if (!*line) {
+		fprintf(stderr, "%s: expecting size after name\n", name);
+		return;
+	}
+
+	if (kind)
+		*kind++ = '\0';
+	*line++ = '\0';
+
+	if (( kind && (size_t)(kind - dev ) > sizeof req.name)
+	 || (!kind && (size_t)(line - dev ) > sizeof req.name)
+	 || ( kind && (size_t)(line - kind) > sizeof req.kind)) {
+		fprintf(stderr, "%s: name or kind too long\n", name);
+		return;
+	}
+
+
+	if (memparse(line, &line, &size) < 0 || !size) {
+		fprintf(stderr, "%s: invalid size\n", name);
+		return;
+	}
+
+	if (*line == '/')
+		if (memparse(line, &line, &alignment) < 0) {
+			fprintf(stderr, "%s: invalid alignment\n", name);
+			return;
+		}
+
+	SKIP_SPACE(line);
+	if (*line) {
+		fprintf(stderr, "%s: unknown arguments at the end: %s\n",
+			name, line);
+		return;
+	}
+
+
+	chunk = chunk_create(name);
+	if (!chunk)
+		return;
+
+	fprintf(stderr, "%s: allocating %p/%p\n", name,
+		(void *)size, (void *)alignment);
+
+	req.magic     = CMA_MAGIC;
+	req.size      = size;
+	req.alignment = alignment;
+
+	strcpy(req.name, dev);
+	if (kind)
+		strcpy(req.kind, kind);
+	else
+		req.kind[0] = '\0';
+
+
+	ret = ioctl(chunk->fd, IOCTL_CMA_ALLOC, &req);
+	if (ret < 0) {
+		fprintf(stderr, "%s: cma_alloc: %s\n", name, strerror(errno));
+		chunk_destroy(chunk);
+	} else {
+		chunk_add(chunk);
+		chunk->size  = req.size;
+		chunk->start = req.start;
+
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+	}
+}
+
+
+static void cmd_free(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	SKIP_SPACE(line);
+
+	if (*line) {
+		unsigned long num;
+
+		errno = 0;
+		num = strtoul(line, &line, 10);
+
+		if (errno || num > INT_MAX) {
+			fprintf(stderr, "%s: invalid number\n", name);
+			return;
+		}
+
+		SKIP_SPACE(line);
+		if (*line) {
+			fprintf(stderr, "%s: unknown arguments at the end: %s\n",
+				name, line);
+			return;
+		}
+
+		for_each(chunk)
+			if (chunk->fd == (int)num)
+				goto ok;
+		fprintf(stderr, "%s: no chunk %3lu\n", name, num);
+		return;
+
+	} else {
+		chunk = root.prev;
+		if (chunk == &root) {
+			fprintf(stderr, "%s: no chunks\n", name);
+			return;
+		}
+	}
+
+ok:
+	fprintf(stderr, "%s: freeing %p@%p\n", name,
+		(void *)chunk->size, (void *)chunk->start);
+	chunk_destroy(chunk);
+}
+
+
+static struct chunk *chunk_create(const char *prefix)
+{
+	struct chunk *chunk;
+	int fd;
+
+	chunk = malloc(sizeof *chunk);
+	if (!chunk) {
+		fprintf(stderr, "%s: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	fd = open("/dev/cma", O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "%s: /dev/cma: %s\n", prefix, strerror(errno));
+		free(chunk);
+		return NULL;
+	}
+	chunk->prev = chunk;
+	chunk->next = chunk;
+	chunk->fd   = fd;
+	return chunk;
+}
+
+static void chunk_destroy(struct chunk *chunk)
+{
+	chunk->prev->next = chunk->next;
+	chunk->next->prev = chunk->prev;
+	close(chunk->fd);
+	free(chunk);
+}
+static void chunk_add(struct chunk *chunk)
+{
+	chunk->next = &root;
+	chunk->prev = root.prev;
+	root.prev->next = chunk;
+	root.prev = chunk;
+}
+
+
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret)
+{
+	unsigned long val;
+
+	SKIP_SPACE(ptr);
+
+	errno = 0;
+	val = strtoul(ptr, &ptr, 0);
+	if (errno)
+		return -1;
+
+	switch (*ptr) {
+	case 'G':
+	case 'g':
+		val <<= 10;	/* fall through */
+	case 'M':
+	case 'm':
+		val <<= 10;	/* fall through */
+	case 'K':
+	case 'k':
+		val <<= 10;
+		++ptr;
+	}
+
+	if (retptr) {
+		SKIP_SPACE(ptr);
+		*retptr = ptr;
+	}
+
+	*ret = val;
+	return 0;
+}
-- 
1.7.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCHv2 4/4] arm: Added CMA to Aquila and Goni
  2010-07-26 14:11       ` Michal Nazarewicz
  (?)
@ 2010-07-26 14:11         ` Michal Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:11 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, linux-media, linux-arm-kernel, Hiremath Vaibhav,
	Marek Szyprowski, Pawel Osciak, Mark Brown, Daniel Walker,
	Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer, Kyungmin Park,
	Michal Nazarewicz

Added the CMA initialisation code to two Samsung platforms.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mach-s5pv210/mach-aquila.c |   13 +++++++++++++
 arch/arm/mach-s5pv210/mach-goni.c   |   13 +++++++++++++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mach-s5pv210/mach-aquila.c b/arch/arm/mach-s5pv210/mach-aquila.c
index 0992618..ab156f9 100644
--- a/arch/arm/mach-s5pv210/mach-aquila.c
+++ b/arch/arm/mach-s5pv210/mach-aquila.c
@@ -19,6 +19,7 @@
 #include <linux/gpio_keys.h>
 #include <linux/input.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -454,6 +455,17 @@ static void __init aquila_map_io(void)
 	s3c24xx_init_uarts(aquila_uartcfgs, ARRAY_SIZE(aquila_uartcfgs));
 }
 
+static void __init aquila_reserve(void)
+{
+	static char regions[] __initdata =
+		"-mfc_fw=1M/128K;mfc_b1=32M;mfc_b2=16M@0x40000000";
+	static char map[] __initdata =
+		"s3c-mfc5/f=mfc_fw;s3c-mfc5/a=mfc_b1;s3c-mfc5/b=mfc_b2";
+
+	cma_set_defaults(regions, map, NULL);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init aquila_machine_init(void)
 {
 	/* PMIC */
@@ -478,4 +490,5 @@ MACHINE_START(AQUILA, "Aquila")
 	.map_io		= aquila_map_io,
 	.init_machine	= aquila_machine_init,
 	.timer		= &s3c24xx_timer,
+	.reserve	= aquila_reserve,
 MACHINE_END
diff --git a/arch/arm/mach-s5pv210/mach-goni.c b/arch/arm/mach-s5pv210/mach-goni.c
index 7b18505..2b0a349 100644
--- a/arch/arm/mach-s5pv210/mach-goni.c
+++ b/arch/arm/mach-s5pv210/mach-goni.c
@@ -19,6 +19,7 @@
 #include <linux/gpio_keys.h>
 #include <linux/input.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -435,6 +436,17 @@ static void __init goni_map_io(void)
 	s3c24xx_init_uarts(goni_uartcfgs, ARRAY_SIZE(goni_uartcfgs));
 }
 
+static void __init goni_reserve(void)
+{
+	static char regions[] __initdata =
+		"-mfc_fw=1M/128K;mfc_b1=32M;mfc_b2=16M@0x40000000";
+	static char map[] __initdata =
+		"s3c-mfc5/f=mfc_fw;s3c-mfc5/a=mfc_b1;s3c-mfc5/b=mfc_b2";
+
+	cma_set_defaults(regions, map, NULL);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init goni_machine_init(void)
 {
 	/* PMIC */
@@ -456,4 +468,5 @@ MACHINE_START(GONI, "GONI")
 	.map_io		= goni_map_io,
 	.init_machine	= goni_machine_init,
 	.timer		= &s3c24xx_timer,
+	.reserve	= goni_reserve,
 MACHINE_END
-- 
1.7.1


* [PATCHv2 0/4] The Contiguous Memory Allocator
@ 2010-07-26 14:11 ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

(Sorry, fixing Cc.  Apparently I had an old ARM kernel list
address in my contacts.  Please disregard the previous mails when
replying.  Once again, sorry for the excessive spamming.)

Hello everyone,

The following patchset implements a Contiguous Memory Allocator.  For
those who have not yet come across CMA, here is an excerpt from the
documentation:

   The Contiguous Memory Allocator (CMA) is a framework, which allows
   setting up a machine-specific configuration for physically-contiguous
   memory management. Memory for devices is then allocated according
   to that configuration.

   The main role of the framework is not to allocate memory, but to
   parse and manage memory configurations, as well as to act as an
   in-between between device drivers and pluggable allocators. It is
   thus not tied to any memory allocation method or strategy.

For more information please refer to the second patch from the
patchset which contains the documentation.


This is the second version of the patchset.  All of the changes are
concentrated in the second patch -- the other patches are almost
identical.

Major observable changes are:

1. The "cma_map" command line parameter has been removed.  In exchange,
   a SysFS entry has been created under kernel/mm/contiguous.

   The configuration strings passed to CMA are now called attributes
   in the documentation.

   The intended way of specifying the attributes is
   a cma_set_defaults() function called by platform initialisation
   code.  The "regions" attribute (the string specified by the "cma"
   command line parameter) can be overridden with that command line
   parameter; the other attributes can be changed at run-time using
   the SysFS entries.

   (I still believe that the other attributes should have their own
   command line arguments as well, but since they caused a lot of
   controversy (and many stopped reading after encountering them),
   "cma_map" has been removed.)

2. The behaviour of the "map" attribute has been modified slightly.
   Currently, if no rule matches a given device, it is assigned the
   regions specified by the "asterisk" attribute.  By default, that
   attribute is built from the region names given in the "regions"
   attribute.

   This also means that if no "map" is specified all devices use all
   the regions specified in the "regions" attribute.  This should be
   a handy default.

3. Devices can register private regions as well as regions that can be
   shared but are not reserved using standard CMA mechanisms.
   A private region has no name and can be accessed only by devices
   that have the pointer to it.

   Moreover, if a device manages to run its code early enough, it can
   register an "early region".  An early region is one for which
   memory has not yet been reserved.  At one point, platform
   initialisation code reserves memory for all registered early
   regions and, if this succeeds, those regions are registered as
   normal regions that can be used with the standard API.  This may be
   handy for devices that need some private region but don't want to
   worry about reserving it.

4. The way allocators are registered has changed.  Currently,
   a cma_allocator_register() function is used for that purpose.
   Moreover, allocators are attached to regions the first time memory
   is registered from the region or when an allocator is registered,
   which means that allocators can be dynamic modules that are loaded
   after the kernel has booted (of course, it won't be possible to
   allocate a chunk of memory from a region if its allocator is not
   loaded).
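
To illustrate the rule matching mentioned in item 2: the documentation
in patch 2/4 says that patterns may contain question marks and may end
with an asterisk that matches the rest of the string.  What follows is
only a user-space sketch of that matching, not the code from mm/cma.c.
The name cma_pattern_match() is made up for this example, and the
sketch assumes that '?' matches exactly one character and that '*' is
special only as the last character of a pattern.

```c
#include <stdbool.h>

/*
 * Sketch only (hypothetical helper, not part of the patchset).
 * '?' matches any single character; '*' is honoured only as the last
 * character of the pattern, where it matches the rest of the string
 * (including nothing).
 */
static bool cma_pattern_match(const char *pattern, const char *str)
{
	for (; *pattern; ++pattern, ++str) {
		/* A trailing asterisk swallows whatever is left. */
		if (pattern[0] == '*' && pattern[1] == '\0')
			return true;
		/* The string ended before the pattern did. */
		if (*str == '\0')
			return false;
		if (*pattern != '?' && *pattern != *str)
			return false;
	}
	/* The pattern is exhausted; the string must be as well. */
	return *str == '\0';
}
```

With this, a dev-pattern of "video" accepts only a device named
"video", while "s3c-*" accepts any device whose name starts with
"s3c-".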


Index of new functions:

+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)

+static inline int
+cma_info_about(struct cma_info *info, const char *regions)

+int __must_check cma_region_register(struct cma_region *reg);

+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);

+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions,
+               size_t size, dma_addr_t alignment);

+int cma_allocator_register(struct cma_allocator *alloc);


The patches in the patchset include:

Michal Nazarewicz (4):
  lib: rbtree: rb_root_init() function added

    The rb_root_init() function initialises an RB tree with a single
    node placed in the root.  This is more convenient than
    initialising an empty tree and then adding an element.

  mm: cma: Contiguous Memory Allocator added

    This is the main patch of the patchset.  It implements the CMA
    framework, including the best-fit allocator, and adds documentation.

  mm: cma: Test device and application added

    This patch adds a misc device that works as a proxy to the CMA
    framework and a simple testing application.  This lets one test
    the whole framework from user space as well as replay a recorded
    allocate/free sequence.

  arm: Added CMA to Aquila and Goni

    This patch adds the CMA platform initialisation code to two ARM
    platforms.  It serves as an example of how this is achieved.
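
As a rough illustration of what the "regions" strings used by the
platform code contain, the sketch below parses such a string in user
space, following the grammar from Documentation/contiguous-memory.txt
(name '=' size, then optional '@' start, '/' alignment and
':' allocator name).  It is not the parser from mm/cma.c: struct
region_spec, parse_memsize(), parse_region() and parse_regions() are
names invented for this example, and a leading '-' is merely recorded
without being interpreted.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space mirror of one region; not struct cma_region. */
struct region_spec {
	char name[32];
	char alloc[32];		/* empty when no ':' ALLOC-NAME was given */
	unsigned long long size, start, alignment;
	int dash;		/* non-zero when the spec began with '-' */
};

/* Parse a number with an optional K/M/G suffix; returns -1 on error. */
static int parse_memsize(const char **p, unsigned long long *out)
{
	unsigned long long val;
	char *end;

	errno = 0;
	val = strtoull(*p, &end, 0);
	if (errno || end == *p)
		return -1;

	switch (*end) {
	case 'G': case 'g': val <<= 30; ++end; break;
	case 'M': case 'm': val <<= 20; ++end; break;
	case 'K': case 'k': val <<= 10; ++end; break;
	}

	*p = end;
	*out = val;
	return 0;
}

/* Parse one "region" production and advance *p past it. */
static int parse_region(const char **p, struct region_spec *r)
{
	const char *s = *p;
	size_t len;

	memset(r, 0, sizeof *r);
	if (*s == '-') {
		r->dash = 1;
		++s;
	}

	len = strcspn(s, "=;");
	if (!len || s[len] != '=' || len >= sizeof r->name)
		return -1;
	memcpy(r->name, s, len);	/* memset() above keeps it terminated */
	s += len + 1;

	if (parse_memsize(&s, &r->size) < 0 || !r->size)
		return -1;
	if (*s == '@') {
		++s;
		if (parse_memsize(&s, &r->start) < 0)
			return -1;
	}
	if (*s == '/') {
		++s;
		if (parse_memsize(&s, &r->alignment) < 0)
			return -1;
	}
	if (*s == ':') {
		++s;
		len = strcspn(s, ";");
		if (!len || len >= sizeof r->alloc)
			return -1;
		memcpy(r->alloc, s, len);
		s += len;
	}

	*p = s;
	return 0;
}

/* Parse a whole "regions" attribute; returns the region count or -1. */
static int parse_regions(const char *s, struct region_spec *out, int max)
{
	int n = 0;

	while (*s) {
		if (n == max || parse_region(&s, &out[n]) < 0)
			return -1;
		++n;
		if (*s == ';')
			++s;
		else if (*s)
			return -1;
	}
	return n;
}
```

Feeding it the Aquila/Goni string
"-mfc_fw=1M/128K;mfc_b1=32M;mfc_b2=16M@0x40000000" yields three
regions: mfc_fw (1 MiB, aligned to 128 KiB), mfc_b1 (32 MiB) and
mfc_b2 (16 MiB, requested at 0x40000000).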

 Documentation/00-INDEX                             |    2 +
 .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
 Documentation/contiguous-memory.txt                |  646 +++++++++++
 Documentation/kernel-parameters.txt                |    4 +
 arch/arm/mach-s5pv210/mach-aquila.c                |   13 +
 arch/arm/mach-s5pv210/mach-goni.c                  |   13 +
 drivers/misc/Kconfig                               |    8 +
 drivers/misc/Makefile                              |    1 +
 drivers/misc/cma-dev.c                             |  184 +++
 include/linux/cma.h                                |  475 ++++++++
 include/linux/rbtree.h                             |   11 +
 mm/Kconfig                                         |   34 +
 mm/Makefile                                        |    3 +
 mm/cma-best-fit.c                                  |  407 +++++++
 mm/cma.c                                           | 1170 ++++++++++++++++++++
 tools/cma/cma-test.c                               |  373 +++++++
 16 files changed, 3353 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c
 create mode 100644 tools/cma/cma-test.c

* [PATCHv2 1/4] lib: rbtree: rb_root_init() function added
@ 2010-07-26 14:11   ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

Added an rb_root_init() function which initialises an rb_root
structure as a red-black tree with at most one element.  The
rationale is that using rb_root_init(root, node) is more
straightforward and cleaner than first initialising an
empty tree and then performing an insert operation.
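
For illustration only, here is a user-space sketch of what the helper
does.  The struct definitions are simplified stand-ins for the kernel's
rbtree types (only the fields touched by rb_root_init() are modelled),
and struct item is a hypothetical container that does not appear in the
patch.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions (assumption). */
#define RB_RED   0
#define RB_BLACK 1

struct rb_node {
	unsigned long rb_parent_color;	/* parent pointer and colour, packed */
	struct rb_node *rb_right;
	struct rb_node *rb_left;
};

struct rb_root {
	struct rb_node *rb_node;
};

/* Same body as the function added by this patch. */
static inline void rb_root_init(struct rb_root *root, struct rb_node *node)
{
	root->rb_node = node;
	if (node) {
		node->rb_parent_color = RB_BLACK; /* black, no parent */
		node->rb_left  = NULL;
		node->rb_right = NULL;
	}
}

/* Hypothetical object embedding an rb_node, for the example only. */
struct item {
	struct rb_node node;
	int key;
};
```

A caller holding a struct item it and a struct rb_root root would
write rb_root_init(&root, &it.node) and get a tree whose only node is
a black root; passing NULL for the node yields an empty tree.  Without
the helper, the same result in the kernel takes an assignment of
RB_ROOT followed by rb_link_node() and rb_insert_color().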

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 include/linux/rbtree.h |   11 +++++++++++
 1 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 7066acb..5b6dc66 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -130,6 +130,17 @@ static inline void rb_set_color(struct rb_node *rb, int color)
 }
 
 #define RB_ROOT	(struct rb_root) { NULL, }
+
+static inline void rb_root_init(struct rb_root *root, struct rb_node *node)
+{
+	root->rb_node = node;
+	if (node) {
+		node->rb_parent_color = RB_BLACK; /* black, no parent */
+		node->rb_left  = NULL;
+		node->rb_right = NULL;
+	}
+}
+
 #define	rb_entry(ptr, type, member) container_of(ptr, type, member)
 
 #define RB_EMPTY_ROOT(root)	((root)->rb_node == NULL)
-- 
1.7.1

* [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-26 14:11     ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

The Contiguous Memory Allocator framework is a set of APIs for
allocating physically contiguous chunks of memory.

Various chips require contiguous blocks of memory to operate.  Those
chips include devices such as cameras, hardware video decoders and
encoders, etc.

The code is highly modular and customisable to suit the needs of
various users.  The set of regions reserved for CMA can be configured
at run-time, and it is easy to add custom allocator algorithms if one
has such a need.

For more details see Documentation/contiguous-memory.txt.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
---
 Documentation/00-INDEX                             |    2 +
 .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
 Documentation/contiguous-memory.txt                |  646 +++++++++++
 Documentation/kernel-parameters.txt                |    4 +
 include/linux/cma.h                                |  445 ++++++++
 mm/Kconfig                                         |   34 +
 mm/Makefile                                        |    3 +
 mm/cma-best-fit.c                                  |  407 +++++++
 mm/cma.c                                           | 1170 ++++++++++++++++++++
 9 files changed, 2720 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
 create mode 100644 Documentation/contiguous-memory.txt
 create mode 100644 include/linux/cma.h
 create mode 100644 mm/cma-best-fit.c
 create mode 100644 mm/cma.c

diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
index 5405f7a..bb50209 100644
--- a/Documentation/00-INDEX
+++ b/Documentation/00-INDEX
@@ -94,6 +94,8 @@ connector/
 	- docs on the netlink based userspace<->kernel space communication mod.
 console/
 	- documentation on Linux console drivers.
+contiguous-memory.txt
+	- documentation on physically-contiguous memory allocation framework.
 cpu-freq/
 	- info on CPU frequency and voltage scaling.
 cpu-hotplug.txt
diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
new file mode 100644
index 0000000..05e2f6a
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
@@ -0,0 +1,9 @@
+What:		/sys/kernel/mm/contiguous/
+Date:		July 2008
+Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
+Description:
+		/sys/kernel/mm/contiguous/ contains two files: asterisk and
+		map.  They are used to configure the Contiguous Memory
+		Allocator framework.
+
+		For details see Documentation/contiguous-memory.txt.
diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
new file mode 100644
index 0000000..6eb1295
--- /dev/null
+++ b/Documentation/contiguous-memory.txt
@@ -0,0 +1,646 @@
+                                                             -*- org -*-
+
+* Contiguous Memory Allocator
+
+   The Contiguous Memory Allocator (CMA) is a framework, which allows
+   setting up a machine-specific configuration for physically-contiguous
+   memory management. Memory for devices is then allocated according
+   to that configuration.
+
+   The main role of the framework is not to allocate memory, but to
+   parse and manage memory configurations, as well as to act as an
+   in-between between device drivers and pluggable allocators. It is
+   thus not tied to any memory allocation method or strategy.
+
+** Why is it needed?
+
+    Various devices on embedded systems have no scatter-gather and/or
+    IO map support and as such require contiguous blocks of memory to
+    operate.  They include devices such as cameras, hardware video
+    decoders and encoders, etc.
+
+    Such devices often require big memory buffers (a full HD frame is,
+    for instance, more than 2 megapixels large, i.e. more than 6 MB
+    of memory), which makes mechanisms such as kmalloc() ineffective.
+
+    Some embedded devices impose additional requirements on the
+    buffers, e.g. they can operate only on buffers allocated in
+    particular location/memory bank (if system has more than one
+    memory bank) or buffers aligned to a particular memory boundary.
+
+    Development of embedded devices has seen a big rise recently
+    (especially in the V4L area) and many such drivers include their
+    own memory allocation code. Most of them use bootmem-based methods.
+    The CMA framework is an attempt to unify contiguous memory
+    allocation mechanisms and provide a simple API for device drivers,
+    while staying as customisable and modular as possible.
+
+** Design
+
+    The main design goal for the CMA was to provide a customisable and
+    modular framework, which could be configured to suit the needs of
+    individual systems.  Configuration specifies a list of memory
+    regions, which then are assigned to devices.  Memory regions can
+    be shared among many device drivers or assigned exclusively to
+    one.  This has been achieved in the following ways:
+
+    1. The core of the CMA does not handle allocation of memory and
+       management of free space.  Dedicated allocators are used for
+       that purpose.
+
+       This way, if the provided solution does not match demands
+       imposed on a given system, one can develop a new algorithm and
+       easily plug it into the CMA framework.
+
+       The presented solution includes an implementation of a best-fit
+       algorithm.
+
+    2. CMA allows a run-time configuration of the memory regions it
+       will use to allocate chunks of memory from.  The set of memory
+       regions is given on command line so it can be easily changed
+       without the need for recompiling the kernel.
+
+       Each region has its own size, alignment demand, a start
+       address (physical address where it should be placed) and an
+       allocator algorithm assigned to the region.
+
+       This means that there can be different algorithms running at
+       the same time, if different devices on the platform have
+       distinct memory usage characteristics and different algorithms
+       match those best.
+
+    3. When requesting memory, devices have to introduce themselves.
+       This way CMA knows who the memory is allocated for.  This
+       allows for the system architect to specify which memory regions
+       each device should use.
+
+       3a. Devices can also specify a "kind" of memory they want.
+           This makes it possible to configure the system in such
+           a way, that a single device may get memory from different
+           memory regions, depending on the "kind" of memory it
+           requested.  For example, a video codec driver might want to
+           allocate some shared buffers from the first memory bank and
+           the other from the second to get the highest possible
+           memory throughput.
+
+    4. For greater flexibility and extensibility, the framework allows
+       device drivers to register private regions of reserved memory
+       which then may be used only by them.
+
+       As a result, even if a driver does not use the rest of the CMA
+       interface, it can still use CMA allocators and other
+       mechanisms.
+
+       4a. Early in the boot process, device drivers can also ask the
+           CMA framework to reserve a region of memory for them
+           which then will be used as a private region.
+
+           This way, drivers do not need to directly call bootmem,
+           memblock or similar early allocator but merely register an
+           early region and the framework will handle the rest
+           including choosing the right early allocator.
+
+** Use cases
+
+    Let's analyse some imaginary system that uses the CMA to see how
+    the framework can be used and configured.
+
+
+    We have a platform with a hardware video decoder and a camera each
+    needing 20 MiB of memory in worst case.  Our system is written in
+    such a way though that the two devices are never used at the same
+    time and memory for them may be shared.  In such a system the
+    following two configuration strings would be used:
+
+        static char regions[]  __initdata = "region=20M";
+        static char map[]      __initdata = "video,camera=region";
+
+        cma_set_defaults(regions, map, NULL);
+
+    .regions instructs CMA to allocate a region of 20 MiB and use the
+    first available memory allocator on it.  .map says that drivers
+    named "video" and "camera" are to be granted memory from the
+    previously defined region.
+
+    This can in fact be written in a simpler way:
+
+        static char regions[]  __initdata = "region=20M";
+
+        cma_set_defaults(regions, NULL, NULL);
+
+    By omitting the .map attribute, we say that all drivers are to use
+    all the regions defined with the first argument.  Generally, if
+    a device/kind pair is not matched by any rule from the .map it is
+    granted access to all regions from the .regions attribute.
+
+    We can see, that because the devices share the same region of
+    memory, we save 20 MiB of memory, compared to the situation when
+    each of the devices would reserve 20 MiB of memory for itself.
+
+
+    Now, let's say that we also have many other smaller devices and we
+    want them to share some smaller pool of memory, for instance 5
+    MiB.  This can be achieved in several ways:
+
+        static char regions[]  __initdata = "region=20M;common=5M";
+        static char map[]      __initdata =
+                "video,camera=region;*/*=common";
+
+        cma_set_defaults(regions, map, NULL);
+
+    This instructs CMA to reserve two regions and let video and camera
+    use region "region" whereas all other devices should use region
+    "common".  Alternatively, we might say:
+
+        static char regions[]  __initdata = "region=20M;common=5M";
+        static char map[]      __initdata = "video,camera=region";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    It works in a similar fashion, except that by specifying the
+    .asterisk attribute, we say that devices not matched by any rule
+    from .map should access regions listed in the .asterisk attribute.
+
+
+    Later on, after some development of the system, it can now run
+    video decoder and camera at the same time.  The 20 MiB region is
+    no longer enough for the two to share.  A quick fix can be made to
+    grant each of those devices separate regions:
+
+        static char regions[]  __initdata = "v=20M;c=20M;common=5M";
+        static char map[]      __initdata = "video=v;camera=c";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    This solution also shows how with CMA you can assign private pools
+    of memory to each device if that is required.
+
+    Allocation mechanisms can be replaced dynamically in a similar
+    manner as well. Let's say that during testing, it has been
+    discovered that, for a given shared region of 40 MiB,
+    fragmentation has become a problem.  It has been observed that,
+    after some time, it becomes impossible to allocate buffers of the
+    required sizes. So to satisfy our requirements, we would have to
+    reserve a larger shared region beforehand.
+
+    But fortunately, you have also managed to develop a new allocation
+    algorithm -- Neat Allocation Algorithm or "na" for short -- which
+    satisfies the needs for both devices even on a 30 MiB region.  The
+    configuration can be then quickly changed to:
+
+        static char regions[]  __initdata = "region=30M:na;common=5M";
+        static char map[]      __initdata = "video,camera=region";
+        static char asterisk[] __initdata = "common";
+
+        cma_set_defaults(regions, map, asterisk);
+
+    This shows how you can develop your own allocation algorithms if
+    the ones provided with CMA do not suit your needs and easily
+    replace them, without the need to modify the CMA core or even
+    recompile the kernel.
+
+** Technical Details
+
+*** The attributes
+
+    As shown above, CMA is configured by a set of attributes: regions,
+    map and asterisk.  The first one specifies regions that are to be
+    reserved for CMA.  The second one specifies what regions each
+    device is assigned to.  The third one specifies what regions
+    devices without a rule in map should have access to.
+
+**** Regions
+
+     The format of the "regions" attribute is as follows:
+
+         regions-attr  ::= [ regions [ ';' ] ]
+         regions       ::= region [ ';' regions ]
+
+         region        ::= [ '-' ] REG-NAME
+                             '=' size
+                           [ '@' start ]
+                           [ '/' alignment ]
+                           [ ':' ALLOC-NAME ]
+
+         size          ::= MEMSIZE   // size of the region
+         start         ::= MEMSIZE   // desired start address of
+                                     // the region
+         alignment     ::= MEMSIZE   // alignment of the start
+                                     // address of the region
+
+     REG-NAME specifies the name of the region.  All regions given
+     via the regions attribute need to have a name.  Moreover, all
+     regions need to have a unique name.  If two regions have the same
+     name it is unspecified which will be used when requesting to
+     allocate memory from a region with the given name.
+
+     ALLOC-NAME specifies the name of allocator to be used with the
+     region.  If no allocator name is provided, the "default"
+     allocator will be used with the region.  The "default" allocator
+     is, of course, the first allocator that has been registered. ;)
+
+     size, start and alignment are specified in bytes with suffixes
+     that memparse() accepts.  If start is given, the region will be
+     reserved at the given starting address (or as close to it as
+     possible).  If alignment is specified, the region will be aligned
+     to the given value.
+
+     This attribute can also be specified by the command line argument
+     "cma".  If it is, the platform's defaults won't be used.
+
+**** Map
+
+     The format of the "map" attribute is as follows:
+
+         map-attr      ::= [ rules [ ';' ] ]
+         rules         ::= rule [ ';' rules ]
+         rule          ::= patterns '=' regions
+
+         patterns      ::= pattern [ ',' patterns ]
+
+         regions       ::= REG-NAME [ ',' regions ] | '*'
+                       // list of regions to try to allocate memory
+                       // from
+
+         pattern       ::= dev-pattern [ '/' kind-pattern ]
+                       | '/' kind-pattern
+                       // pattern request must match for the rule to
+                       // apply; the first rule that matches is
+                       // applied; if dev-pattern part is omitted
+                       // value identical to the one used in previous
+                       // pattern is assumed
+
+         dev-pattern   ::= PATTERN-STR
+                       // pattern that device name must match for the
+                       // rule to apply.
+         kind-pattern  ::= PATTERN-STR
+                       // pattern that "kind" of memory (provided by
+                       // device) must match for the rule to apply.
+
+     The "map" attribute is a sequence of rules which specify which
+     regions a given (device, kind) pair should use.  The first rule
+     that matches is applied.
+
+     For a rule to match, its pattern must match the (dev, kind) pair.
+     A pattern consists of the parts before and after the slash.  The
+     first part must match the device name and the second part must
+     match the kind.
+
+     If the first part is empty, the device name is assumed to match
+     iff it matched in previous pattern.
+
+     Not specifying the second part matches only empty, or
+     not-specified kind.
+
+     Patterns may contain question marks, each of which matches any
+     single character, and may end with an asterisk, which matches the
+     rest of the string (including nothing).
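+
+     The matching rules for a single pattern can be sketched in
+     user-space C as follows.  pattern_match() is a hypothetical name
+     used only for this illustration and is not necessarily how CMA
+     implements matching; an asterisk that is not the last character
+     is treated here as an ordinary character:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical matcher: '?' matches exactly one character and
 * a trailing '*' matches the rest of the string (including nothing).
 */
static bool pattern_match(const char *pattern, const char *str)
{
	assert(pattern && str);

	for (; *pattern; ++pattern, ++str) {
		if (*pattern == '*' && !pattern[1])
			return true;	/* trailing asterisk: match the rest */
		if (!*str)
			return false;	/* string ended before the pattern */
		if (*pattern != '?' && *pattern != *str)
			return false;	/* literal character mismatch */
	}
	return !*str;			/* both must end together */
}
```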
+
+     The '*' as the list of regions means to use all the regions
+     listed in the "asterisk" attribute.  Moreover, if no rule matches
+     a device, the list of regions from the "asterisk" attribute is
+     used as well (i.e. a "*/*=*" rule is assumed at the end).
+
+     This attribute is accessible via SysFS and can be changed at
+     run-time by writing to /sys/kernel/mm/contiguous/map.
+
+**** Asterisk
+
+     The format of the "asterisk" attribute is as follows:
+
+         asterisk-attr ::= [ regions ]
+         regions       ::= REG-NAME [ ',' regions ]
+
+     If no rule from the "map" attribute matches a given device, the
+     regions specified by the "asterisk" attribute are used.
+
+     If this attribute is not specified by a call to cma_set_defaults()
+     it is built automatically from the regions listed in the
+     "regions" attribute.  All regions are included except for the ones
+     whose name is prefixed by a minus sign.
+
+     For instance, the following:
+
+         static char regions[]  __initdata = "region=20M;common=5M";
+         static char map[]      __initdata = "video,camera=region";
+         static char asterisk[] __initdata = "common";
+
+     can be replaced by:
+
+         static char regions[]  __initdata = "-region=20M;common=5M";
+         static char map[]      __initdata = "video,camera=region";
+
+     The biggest difference is that if the user specifies the "cma"
+     command line parameter and the platform does not specify the
+     "asterisk" attribute, it will be built from the user-supplied
+     "cma" parameter.  If the "asterisk" attribute were specified by
+     platform code, it would not be rebuilt from the command line
+     parameter.
+
+     This attribute is accessible via SysFS and can be changed at
+     run-time by writing to /sys/kernel/mm/contiguous/asterisk.
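+
+     How the "asterisk" attribute could be derived from the "regions"
+     attribute can be sketched in user-space C.  build_asterisk() is
+     a hypothetical helper written for this illustration; it assumes
+     a whitespace-free "name=spec;name=spec" string and is not the
+     code CMA uses:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies into out the names of all regions in
 * a "name=spec;name=spec" string whose name is not prefixed by '-',
 * separated by commas.  Returns 0 on success, -1 if out is too small.
 */
static int build_asterisk(const char *regions, char *out, size_t len)
{
	size_t used = 0;

	assert(regions && out);
	if (!len)
		return -1;

	while (*regions) {
		const char *end = strchr(regions, ';');
		const char *eq;
		size_t n;

		if (!end)
			end = regions + strlen(regions);
		eq = memchr(regions, '=', (size_t)(end - regions));
		n = (size_t)((eq ? eq : end) - regions);	/* name length */

		if (n && *regions != '-') {
			/* room for an optional comma, the name and a NUL */
			if (used + n + 2 > len)
				return -1;
			if (used)
				out[used++] = ',';
			memcpy(out + used, regions, n);
			used += n;
		}
		regions = *end ? end + 1 : end;
	}
	out[used] = '\0';
	return 0;
}
```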
+
+**** Examples
+
+     Some examples (whitespace added for better readability):
+
+         cma = r1 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                                // (or at least as near as possible)
+                    /1M         // make sure it's aligned to 1M
+                    :foo(bar);  // uses allocator "foo" with "bar"
+                                // as parameters for it
+               r2 = 64M       // 64M region
+                    /1M;        // make sure it's aligned to 1M
+                                // uses the first available allocator
+               r3 = 64M       // 64M region
+                    @512M       // starting at address 512M
+                    :foo;       // uses allocator "foo" with no parameters
+
+         map     = foo = r1;
+                       // device foo with kind==NULL uses region r1
+
+                   foo/quaz = r2;  // OR:
+                   /quaz = r2;
+                       // device foo with kind == "quaz" uses region r2
+
+                   foo/* = r3;     // OR:
+                   /* = r3;
+                       // device foo with any other kind uses region r3
+
+                   bar/* = r1,r2;
+                       // device bar with any kind uses region r1 or r2
+
+                   baz?/a* , baz?/b* = r3;
+                       // devices named baz? where ? is any character
+                       // with kind being a string starting with "a" or
+                       // "b" use r3
+
+*** The device and kind of memory
+
+    The name of the device is taken from the device structure.  It is
+    not possible to use CMA if the driver does not register a device
+    (although this can be overcome by providing a fake device
+    structure with at least the name set).
+
+    The kind of memory is an optional argument provided by the device
+    whenever it requests memory chunk.  In many cases this can be
+    ignored but sometimes it may be required for some devices.
+
+    For instance, let us say that there are two memory banks and for
+    performance reasons a device uses buffers in both of them.  In
+    such a case, the device driver would define two kinds and use them
+    for different buffers.  CMA attributes could look as follows:
+
+         static char regions[]  __initdata = "a=32M@0;b=32M@512M";
+         static char map[]      __initdata = "foo/a=a;foo/b=b";
+
+    Whenever the driver allocates memory, it specifies the kind of
+    memory:
+
+        buffer1 = cma_alloc(dev, "a", 1 << 20, 0);
+        buffer2 = cma_alloc(dev, "b", 1 << 20, 0);
+
+    If the driver should fall back to the other bank when the
+    dedicated one is full, the attributes could be changed to:
+
+         static char regions[]  __initdata = "a=32M@0;b=32M@512M";
+         static char map[]      __initdata = "foo/a=a,b;foo/b=b,a";
+
+    On the other hand, if the same driver was used on a system with
+    only one bank, the attributes could be changed to:
+
+         static char regions[]  __initdata = "r=64M";
+         static char map[]      __initdata = "foo/*=r";
+
+    without the need to change the driver at all.
+
+*** Device API
+
+    There are four calls provided by the CMA framework to devices.  To
+    allocate a chunk of memory, the cma_alloc() function is used:
+
+        dma_addr_t cma_alloc(const struct device *dev, const char *kind,
+                             size_t size, dma_addr_t alignment);
+
+    If required, the device may specify the alignment in bytes that
+    the chunk needs to satisfy.  It has to be a power of two or zero.
+    Chunks are always aligned at least to a page.
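+
+    The alignment rules can be sketched in user-space C.  The names
+    below and the 4096-byte page size are assumptions made for this
+    illustration only; in the kernel the real page size would be
+    used:

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Zero and powers of two are accepted; everything else is invalid. */
static bool alignment_is_valid(unsigned long alignment)
{
	return (alignment & (alignment - 1)) == 0;
}

/* Alignments below the page size are raised to the page size. */
static unsigned long effective_alignment(unsigned long alignment)
{
	assert(alignment_is_valid(alignment));

	return alignment < SKETCH_PAGE_SIZE ? SKETCH_PAGE_SIZE : alignment;
}
```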
+
+    The kind specifies the kind of memory as described in the
+    previous subsection.  If the device driver does not use the notion
+    of memory kinds it's safe to pass NULL as the kind.
+
+    The basic usage of the function is simply:
+
+        addr = cma_alloc(dev, NULL, size, 0);
+
+    The function returns the physical address of the allocated chunk
+    or a value that evaluates to true when checked with
+    IS_ERR_VALUE(), so the correct way of checking for errors is:
+
+        dma_addr_t addr = cma_alloc(dev, NULL, size, 0);
+        if (IS_ERR_VALUE(addr))
+                return (int)addr;
+        /* Allocated */
+
+    (Make sure to include <linux/err.h> which contains the definition
+    of the IS_ERR_VALUE() macro.)
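+
+    The error convention can be tried outside the kernel with the
+    sketch below.  The macros are simplified user-space copies of the
+    ones in <linux/err.h>; fake_cma_alloc() and grab_buffer() are
+    hypothetical stand-ins invented for this example, and dma_addr_t
+    is assumed to be as wide as unsigned long:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified user-space copies of the kernel definitions. */
#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

typedef unsigned long dma_addr_t;	/* assumption for this sketch */

/* Hypothetical stand-in for cma_alloc(): fails for zero-sized requests. */
static dma_addr_t fake_cma_alloc(size_t size)
{
	if (!size)
		return (dma_addr_t)-EINVAL;	/* error cast to an address */
	return 0x20000000UL;			/* made-up physical address */
}

/* Returns 0 and stores the address on success, negative errno otherwise. */
static int grab_buffer(size_t size, dma_addr_t *out)
{
	dma_addr_t addr = fake_cma_alloc(size);

	assert(out);
	if (IS_ERR_VALUE(addr))
		return (int)addr;
	*out = addr;
	return 0;
}
```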
+
+
+    An allocated chunk is freed via the cma_put() function:
+
+        int cma_put(dma_addr_t addr);
+
+    It takes the physical address of the chunk as an argument and
+    decreases its reference counter.  If the counter reaches zero the
+    chunk is freed.  Most of the time users do not need to think about
+    the reference counter and can simply use cma_put() as a free call.
+
+    If, however, a chunk is to be shared with others, the built-in
+    reference counter may turn out to be handy.  To increment it, one
+    needs to use the cma_get() function:
+
+        int cma_get(dma_addr_t addr);
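+
+    The reference counting semantics can be modelled in user space as
+    shown below.  struct sketch_chunk, sketch_get() and sketch_put()
+    are hypothetical names for a single-chunk model; they mirror the
+    return values of cma_get() and cma_put() but are not the kernel
+    implementation:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, simplified model of one chunk. */
struct sketch_chunk {
	unsigned long start;	/* address the chunk begins at */
	unsigned refs;		/* 0 means the chunk has been freed */
};

/* Returns 0 on success or -ENOENT if there is no chunk at addr. */
static int sketch_get(struct sketch_chunk *c, unsigned long addr)
{
	assert(c);
	if (!c->refs || c->start != addr)
		return -ENOENT;
	c->refs++;
	return 0;
}

/*
 * Returns 1 if the chunk has been freed, 0 if it is still referenced
 * and -ENOENT if there is no chunk at addr.
 */
static int sketch_put(struct sketch_chunk *c, unsigned long addr)
{
	assert(c);
	if (!c->refs || c->start != addr)
		return -ENOENT;
	return --c->refs == 0;
}
```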
+
+
+    The last function is cma_info(), which returns information about
+    regions assigned to a given (dev, kind) pair.  Its syntax is:
+
+        int cma_info(struct cma_info *info,
+                     const struct device *dev,
+                     const char *kind);
+
+    On successful exit it fills the info structure with lower and
+    upper bound of regions, total size and number of regions assigned
+    to given (dev, kind) pair.
+
+**** Dynamic and private regions
+
+     In the basic setup, regions are provided via command line argument
+     and then reserved by platform initialisation code (which usually
+     calls the cma_early_regions_reserve() function).
+
+     It is, however, possible to create and add regions dynamically
+     using cma_region_register() function.
+
+         int cma_region_register(struct cma_region *reg);
+
+     The region does not have to have a name.  If it does not, it
+     cannot be accessed via the standard mapping (the one provided with
+     the "map" attribute).  Such regions are private and to allocate
+     a chunk from them, one needs to call:
+
+         dma_addr_t cma_alloc_from_region(struct cma_region *reg,
+                                          size_t size, dma_addr_t alignment);
+
+     It is just like cma_alloc() except that one specifies which
+     region to allocate memory from.  The region must have been
+     registered.
+
+**** Allocating from region specified by name
+
+     If a driver prefers to allocate from a region or list of regions
+     whose names it knows, it can use a different call similar to the
+     previous one:
+
+         dma_addr_t cma_alloc_from(const char *regions,
+                                   size_t size, dma_addr_t alignment);
+
+     The first argument is a comma-separated list of regions the
+     driver desires CMA to try and allocate from.  The list is
+     terminated by NUL byte or a semicolon.
+
+     Similarly, there is a call for requesting information about named
+     regions:
+
+        int cma_info_about(struct cma_info *info, const char *regions);
+
+     Generally, it should not be needed to use those interfaces but
+     they are provided nevertheless.
+
+**** Registering early regions
+
+     An early region is a region that is managed by CMA early during
+     the boot process.  It is the platform's responsibility to reserve
+     memory for early regions.  Later on, when CMA initialises, early
+     regions with reserved memory are registered as normal regions.
+     Registering an early region may be a way for a device to request
+     a private pool of memory without worrying about actually
+     reserving the memory:
+
+         int cma_early_region_register(struct cma_region *reg);
+
+     This needs to be done quite early in the boot process, before the
+     platform traverses the cma_early_regions list to reserve memory.
+
+     When the boot process ends, the device driver may check whether
+     the region was reserved (by checking the reg->reserved flag) and
+     if so, whether it was successfully registered as a normal region
+     (by checking the reg->registered flag).  If that is the case, the
+     device driver can use normal API calls to use the region.
+
+*** Allocator operations
+
+    Creating an allocator for CMA needs four functions to be
+    implemented.
+
+
+    The first two are used to initialise an allocator for a given
+    region and clean up afterwards:
+
+        int  cma_foo_init(struct cma_region *reg);
+        void cma_foo_cleanup(struct cma_region *reg);
+
+    The first is called when the allocator is attached to a region.
+    The cma_region structure holds the starting address of the region
+    as well as its size.  Any data that the allocator associates with
+    the region can be saved in the private_data field.
+
+    The second call cleans up and frees all resources the allocator
+    has allocated for the region.  The function can assume that all
+    chunks allocated from this region have been freed, thus the whole
+    region is free.
+
+
+    The two other calls are used for allocating and freeing chunks.
+    They are:
+
+        struct cma_chunk *cma_foo_alloc(struct cma_region *reg,
+                                        size_t size, dma_addr_t alignment);
+        void cma_foo_free(struct cma_chunk *chunk);
+
+    As the names imply, the first allocates a chunk and the other
+    frees a chunk of memory.  They also manage the cma_chunk object
+    representing the chunk in physical memory.
+
+    Either of those functions can assume that it is the only thread
+    accessing the region.  Therefore, the allocator does not need to
+    worry about concurrency.  Moreover, all arguments are guaranteed
+    to be valid (i.e. a page-aligned size and a power-of-two alignment
+    no lower than the page size).
+
+
+    When allocator is ready, all that is left is to register it by
+    calling cma_allocator_register() function:
+
+            int cma_allocator_register(struct cma_allocator *alloc);
+
+    The argument is a structure with pointers to the above functions
+    and the allocator's name.  The whole call may look something like
+    this:
+
+        static struct cma_allocator alloc = {
+                .name    = "foo",
+                .init    = cma_foo_init,
+                .cleanup = cma_foo_cleanup,
+                .alloc   = cma_foo_alloc,
+                .free    = cma_foo_free,
+        };
+        return cma_allocator_register(&alloc);
+
+    The name ("foo") will be available to use with command line
+    argument.
+
+*** Integration with platform
+
+    There is one function that needs to be called from platform
+    initialisation code.  That is the cma_early_regions_reserve()
+    function:
+
+        void cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+    It traverses the list of all of the regions given on the command
+    line and reserves memory for them.  The only argument is a callback
+    function used to reserve the region.  Passing NULL as the argument
+    makes the function use the cma_early_region_reserve() function,
+    which uses bootmem and memblock for allocating.
+
+    Alternatively, platform code could traverse the cma_early_regions
+    list by itself but this should not be necessary.
+
+
+    The platform also has a way of providing default attributes for
+    CMA.  The cma_set_defaults() function is used for that purpose:
+
+        int __init cma_set_defaults(char *regions, const char *map,
+                                    const char *asterisk);
+
+    It needs to be called after early params have been parsed but
+    prior to reserving regions.  It lets one specify the three
+    attributes used to configure CMA.  Any of the pointers may point
+    to a string in the __initdata section.  The regions argument must
+    be modifiable though.
+
+** Future work
+
+    In the future, implementation of mechanisms that would allow the
+    free space inside the regions to be used as page cache, filesystem
+    buffers or swap devices is planned.  With such mechanisms, the
+    memory would not be wasted when not used.
+
+    Because all allocations and freeing of chunks pass through the CMA
+    framework, it can track which parts of the reserved memory are
+    free and which parts are allocated.  Tracking the unused memory
+    would let CMA use it for other purposes such as page cache, I/O
+    buffers, swap, etc.
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a698255..5c7af89 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -43,6 +43,7 @@ parameter is applicable:
 	AVR32	AVR32 architecture is enabled.
 	AX25	Appropriate AX.25 support is enabled.
 	BLACKFIN Blackfin architecture is enabled.
+	CMA	Contiguous Memory Allocator is enabled.
 	DRM	Direct Rendering Management support is enabled.
 	EDD	BIOS Enhanced Disk Drive Services (EDD) is enabled
 	EFI	EFI Partitioning (GPT) is enabled
@@ -476,6 +477,9 @@ and is between 256 and 4096 characters. It is defined in the file
 			Also note the kernel might malfunction if you disable
 			some critical bits.
 
+	cma=		[CMA] List of CMA regions.
+			See Documentation/contiguous-memory.txt for details.
+
 	cmo_free_hint=	[PPC] Format: { yes | no }
 			Specify whether pages are marked as being inactive
 			when they are freed.  This is used in CMO environments
diff --git a/include/linux/cma.h b/include/linux/cma.h
new file mode 100644
index 0000000..96a8652
--- /dev/null
+++ b/include/linux/cma.h
@@ -0,0 +1,445 @@
+#ifndef __LINUX_CMA_H
+#define __LINUX_CMA_H
+
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+/***************************** Kernel level API *****************************/
+
+#ifdef __KERNEL__
+
+#include <linux/kref.h>
+#include <linux/rbtree.h>
+#include <linux/list.h>
+
+
+struct device;
+struct cma_info;
+
+/*
+ * Don't call it directly, use cma_alloc(), cma_alloc_from() or
+ * cma_alloc_from_region().
+ */
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *kind,
+	    size_t size, dma_addr_t alignment);
+
+/* Don't call it directly, use cma_info() or cma_info_about(). */
+int
+__cma_info(struct cma_info *info, const struct device *dev, const char *kind);
+
+
+/**
+ * cma_alloc - allocates contiguous chunk of memory.
+ * @dev:	The device to perform allocation for.
+ * @kind:	A kind of memory to allocate.  A device may use several
+ * 		different kinds of memory which are configured
+ * 		separately.  Usually it's safe to pass NULL here.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc(const struct device *dev, const char *kind,
+	  size_t size, dma_addr_t alignment)
+{
+	return dev ? __cma_alloc(dev, kind, size, alignment) : -EINVAL;
+}
+
+
+/**
+ * struct cma_info - information about regions returned by cma_info().
+ * @lower_bound:	The smallest address that is possible to be
+ * 			allocated for given (dev, kind) pair.
+ * @upper_bound:	The one byte after the biggest address that is
+ * 			possible to be allocated for given (dev, kind)
+ * 			pair.
+ * @total_size:	Total size of regions mapped to (dev, kind) pair.
+ * @free_size:	Total free size in all of the regions mapped to (dev, kind)
+ * 		pair.  Because of possible race conditions, it is not
+ * 		guaranteed that the value will be correct -- it gives only
+ * 		an approximation.
+ * @count:	Number of regions mapped to (dev, kind) pair.
+ */
+struct cma_info {
+	dma_addr_t lower_bound, upper_bound;
+	size_t total_size, free_size;
+	unsigned count;
+};
+
+/**
+ * cma_info - queries information about regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @dev:	The device to query information for.
+ * @kind:	A kind of memory to query information for.
+ * 		If unsure, pass NULL here.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info(struct cma_info *info, const struct device *dev, const char *kind)
+{
+	return dev ? __cma_info(info, dev, kind) : -EINVAL;
+}
+
+
+/**
+ * cma_get - increases reference counter of a chunk.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns zero on success or -ENOENT if there is no chunk at given
+ * location.  In the latter case issues a warning and a stacktrace.
+ */
+int cma_get(dma_addr_t addr);
+
+/**
+ * cma_put - decreases reference counter of a chunk.
+ * @addr:	Beginning of the chunk.
+ *
+ * Returns one if the chunk has been freed, zero if it hasn't, and
+ * -ENOENT if there is no chunk at given location.  In the latter case
+ * issues a warning and a stacktrace.
+ *
+ * If this function returns zero, you still can not count on the area
+ * remaining in memory.  Only use the return value if you want to see
+ * if the area is now gone, not present.
+ */
+int cma_put(dma_addr_t addr);
+
+
+
+/****************************** Lower level API *****************************/
+
+/**
+ * cma_alloc_from - allocates contiguous chunk of memory from named regions.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ * 		byte or a semicolon.  "*" or NULL means to try all regions
+ * 		which are listed as asterisk regions.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+static inline dma_addr_t __must_check
+cma_alloc_from(const char *regions, size_t size, dma_addr_t alignment)
+{
+	return __cma_alloc(NULL, regions, size, alignment);
+}
+
+/**
+ * cma_info_about - queries information about named regions.
+ * @info:	Pointer to a structure where to save the information.
+ * @regions:	Comma separated list of region names.  Terminated by NUL
+ * 		byte or a semicolon.
+ *
+ * On error returns a negative error, zero otherwise.
+ */
+static inline int
+cma_info_about(struct cma_info *info, const char *regions)
+{
+	return __cma_info(info, NULL, regions);
+}
+
+
+
+struct cma_allocator;
+
+/**
+ * struct cma_region - a region reserved for CMA allocations.
+ * @name:	Unique name of the region.  Read only.
+ * @start:	Physical starting address of the region in bytes.  Always
+ * 		aligned at least to a full page.  Read only.
+ * @size:	Size of the region in bytes.  Multiple of a page size.
+ * 		Read only.
+ * @free_space:	Free space in the region.  Read only.
+ * @alignment:	Desired alignment of the region in bytes.  A power of two,
+ * 		always at least page size.  Early.
+ * @alloc:	Allocator used with this region.  NULL means allocator is
+ * 		not attached.  Private.
+ * @alloc_name:	Allocator name read from cmdline.  Private.  This may be
+ * 		different from @alloc->name.
+ * @private_data:	Allocator's private data.
+ * @used:	Whether region was already used, ie. there was at least
+ * 		one allocation request for it.  Private.
+ * @list:	Entry in list of regions.  Private.
+ * @registered:	Whether this region has been registered.  Read only.
+ * @reserved:	Whether this region has been reserved.  Early.  Read only.
+ * @copy_name:	Whether @name and @alloc_name need to be copied when
+ * 		this region is converted from early to normal.  Early.
+ * 		Private.
+ *
+ * Regions come in two types: an early region and normal region.  The
+ * former can be reserved or not-reserved.  Fields marked as "early"
+ * are only meaningful in early regions.
+ *
+ * Early regions are important only during initialisation.  The list
+ * of early regions is built from the "cma" command line argument or
+ * platform defaults.  Platform initialisation code is responsible for
+ * reserving space for unreserved regions that are placed on
+ * cma_early_regions list.
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+ /* * @users:	Number of chunks allocated in this region. */
+struct cma_region {
+	const char *name;
+	dma_addr_t start;
+	size_t size;
+	union {
+		size_t free_space;	/* Normal region */
+		dma_addr_t alignment;	/* Early region */
+	};
+
+	struct cma_allocator *alloc;
+	const char *alloc_name;
+	union {
+		void *private_data;	/* Normal region w/ allocator */
+		unsigned used;		/* Normal region w/o allocator */
+	};
+
+	/* unsigned users; */
+	struct list_head list;
+
+	unsigned registered:1;
+	unsigned reserved:1;
+	unsigned copy_name:1;
+};
+
+
+/**
+ * cma_region_register() - registers a region.
+ * @reg:	Region to add.
+ *
+ * Region's start and size must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region and accessible only using the
+ * cma_alloc_from_region() function.
+ *
+ * If alloc is set function will try to initialise given allocator
+ * (and will return error if it fails).  Otherwise alloc_name may
+ * point to a name of an allocator to use (if not set, the default
+ * will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  In particular, -EADDRINUSE if
+ * the region overlaps with an already existing region.
+ */
+int __must_check cma_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_alloc_from_region() - allocates contiguous chunk of memory from region.
+ * @reg:	Region to allocate chunk from.
+ * @size:	Size of the memory to allocate in bytes.
+ * @alignment:	Desired alignment in bytes.  Must be a power of two or
+ * 		zero.  If alignment is less than a page size it will be
+ * 		set to page size. If unsure, pass zero here.
+ *
+ * On error returns a negative error cast to dma_addr_t.  Use
+ * IS_ERR_VALUE() to check if returned value is indeed an error.
+ * Otherwise physical address of the chunk is returned.
+ */
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment);
+
+
+
+/****************************** Allocators API ******************************/
+
+/**
+ * struct cma_chunk - an allocated contiguous chunk of memory.
+ * @start:	Physical address in bytes.
+ * @size:	Size in bytes.
+ * @reg:	Region this chunk belongs to.
+ * @ref:	Number of references.  Private.
+ * @by_start:	A node in a red-black tree with all chunks sorted by
+ * 		start address.
+ *
+ * The cma_allocator::alloc() operation needs to set only the @start
+ * and @size fields.  The rest is handled by the caller (ie. CMA
+ * glue).
+ */
+struct cma_chunk {
+	dma_addr_t start;
+	size_t size;
+
+	struct cma_region *reg;
+	struct kref ref;
+	struct rb_node by_start;
+};
+
+
+/**
+ * struct cma_allocator - a CMA allocator.
+ * @name:	Allocator's unique name
+ * @init:	Initialises an allocator on given region.
+ * @cleanup:	Cleans up after init.  May assume that there are no chunks
+ * 		allocated in given region.
+ * @alloc:	Allocates a chunk of memory of given size in bytes and
+ * 		with given alignment.  Alignment is a power of
+ * 		two (thus non-zero) and callback does not need to check it.
+ * 		May also assume that it is the only call that uses given
+ * 		region (ie. access to the region is synchronised with
+ * 		a mutex).  This has to allocate the chunk object (it may be
+ * 		contained in a bigger structure with allocator-specific
+ * 		data).  Required.
+ * @free:	Frees allocated chunk.  May also assume that it is the only
+ * 		call that uses given region.  This has to free() the chunk
+ * 		object as well.  Required.
+ * @list:	Entry in list of allocators.  Private.
+ */
+ /* * @users:	How many regions use this allocator.  Private. */
+struct cma_allocator {
+	const char *name;
+
+	int (*init)(struct cma_region *reg);
+	void (*cleanup)(struct cma_region *reg);
+	struct cma_chunk *(*alloc)(struct cma_region *reg, size_t size,
+				   dma_addr_t alignment);
+	void (*free)(struct cma_chunk *chunk);
+
+	/* unsigned users; */
+	struct list_head list;
+};
+
+
+/**
+ * cma_allocator_register() - Registers an allocator.
+ * @alloc:	Allocator to register.
+ *
+ * Adds allocator to the list of allocators managed by CMA.
+ *
+ * All of the fields of cma_allocator structure must be set except for
+ * the optional name, and users and list, which will be overridden.
+ *
+ * Returns zero or negative error code.
+ */
+int cma_allocator_register(struct cma_allocator *alloc);
+
+
+/**************************** Initialisation API ****************************/
+
+/**
+ * cma_set_defaults() - specifies default command line parameters.
+ * @regions:	Default regions attribute.  May be overridden by "cma"
+ * 		command line argument.
+ * @map:	Default map attribute.  If not set all devices will use
+ * 		regions specified by @asterisk attribute.
+ * @asterisk:	Default asterisk attribute.  If not set will be built from
+ * 		@regions.  Specifies list of regions to use for devices
+ * 		that are missing a rule in @map.
+ *
+ * This function should be called prior to cma_early_regions_reserve()
+ * and after early parameters have been parsed.  The arguments
+ * specify defaults for various CMA attributes.
+ *
+ * All the pointers can point to strings in __initdata.  @regions
+ * needs to be modifiable.
+ *
+ * Returns negative error code if there was an error parsing the
+ * @regions parameter.  Zero otherwise.
+ */
+int __init cma_set_defaults(char *regions, const char *map,
+                            const char *asterisk);
+
+
+/**
+ * cma_early_regions - a list of early regions.
+ *
+ * Platform needs to allocate space for each of the regions before
+ * initcalls are executed.  If space is reserved, the reserved flag
+ * must be set.  Platform initialisation code may choose to use
+ * cma_early_regions_reserve().
+ *
+ * Later, during CMA initialisation all reserved regions from the
+ * cma_early_regions list are registered as normal regions and can be
+ * used using standard mechanisms.
+ */
+extern struct list_head cma_early_regions __initdata;
+
+
+/**
+ * cma_early_region_register() - registers an early region.
+ * @reg:	Region to add.
+ *
+ * Region's start, size and alignment must be set.
+ *
+ * If name is set the region will be accessible using normal mechanism
+ * like mapping or cma_alloc_from() function otherwise it will be
+ * a private region accessible only using the cma_alloc_from_region().
+ *
+ * If alloc is set function will try to initialise given allocator
+ * when the early region is "converted" to normal region and
+ * registered during CMA initialisation.  If this fails, the space
+ * will still be reserved but the region won't be registered.
+ *
+ * As usual, alloc_name may point to a name of an allocator to use
+ * (if neither alloc nor alloc_name is set, the default will be used).
+ *
+ * All other fields are ignored and/or overwritten.
+ *
+ * Returns zero or negative error.  No checking if regions overlap is
+ * performed.
+ */
+int __init __must_check cma_early_region_register(struct cma_region *reg);
+
+
+/**
+ * cma_early_region_reserve() - reserves a physically contiguous memory region.
+ * @reg:	Early region to reserve memory for.
+ *
+ * If platform supports bootmem this is the first allocator this
+ * function tries to use.  If that fails (or bootmem is not
+ * supported) function tries to use memblock if it is available.
+ *
+ * On success sets reg->reserved flag.
+ *
+ * Returns zero or negative error.
+ */
+int __init cma_early_region_reserve(struct cma_region *reg);
+
+/**
+ * cma_early_regions_reserve() - helper function for reserving early regions.
+ * @reserve:	Callback function used to reserve space for region.  Needs
+ * 		to return non-negative if allocation succeeded, negative
+ * 		error otherwise.  NULL means cma_early_region_reserve() will
+ * 		be used.
+ *
+ * This function traverses the %cma_early_regions list and tries to
+ * reserve memory for each early region.  It uses the @reserve
+ * callback function for that purpose.  The reserved flag of each
+ * region is updated accordingly.
+ */
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg));
+
+#else
+
+#define cma_set_defaults(regions, map, asterisk) ((int)0)
+#define cma_early_regions_reserve(reserve)       do { } while (0)
+
+#endif
+
+#endif
diff --git a/mm/Kconfig b/mm/Kconfig
index f4e516e..468b874 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -301,3 +301,37 @@ config NOMMU_INITIAL_TRIM_EXCESS
 	  of 1 says that all excess pages should be trimmed.
 
 	  See Documentation/nommu-mmap.txt for more information.
+
+
+config CMA
+	bool "Contiguous Memory Allocator framework"
+	# Currently there is only one allocator so force it on
+	select CMA_BEST_FIT
+	help
+	  This enables the Contiguous Memory Allocator framework which
+	  allows drivers to allocate big physically-contiguous blocks of
+	  memory for use with hardware components that support neither I/O
+	  map nor scatter-gather.
+
+	  If you select this option you will also have to select at least
+	  one allocator algorithm below.
+
+	  To make use of CMA you need to specify the regions and
+	  driver->region mapping on command line when booting the kernel.
+
+config CMA_DEBUG
+	bool "CMA debug messages"
+	depends on CMA
+	help
+	  Enable debug messages in CMA code.
+
+config CMA_BEST_FIT
+	bool "CMA best-fit allocator"
+	depends on CMA
+	default y
+	help
+	  This is a best-fit algorithm running in O(n log n) time where
+	  n is the number of existing holes (which is never greater than
+	  the number of allocated regions and usually much smaller).  It
+	  allocates area from the smallest hole that is big enough for
+	  the allocation in question.
diff --git a/mm/Makefile b/mm/Makefile
index 34b2546..54b0e99 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -47,3 +47,6 @@ obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
 obj-$(CONFIG_HWPOISON_INJECT) += hwpoison-inject.o
 obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
+
+obj-$(CONFIG_CMA) += cma.o
+obj-$(CONFIG_CMA_BEST_FIT) += cma-best-fit.o
diff --git a/mm/cma-best-fit.c b/mm/cma-best-fit.c
new file mode 100644
index 0000000..59515f9
--- /dev/null
+++ b/mm/cma-best-fit.c
@@ -0,0 +1,407 @@
+/*
+ * Contiguous Memory Allocator framework: Best Fit allocator
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: bf: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/slab.h>        /* kmalloc() */
+
+#include <linux/cma.h>         /* CMA structures */
+
+
+/************************* Data Types *************************/
+
+struct cma_bf_item {
+	struct cma_chunk ch;
+	struct rb_node by_size;
+};
+
+struct cma_bf_private {
+	struct rb_root by_start_root;
+	struct rb_root by_size_root;
+};
+
+
+/************************* Prototypes *************************/
+
+/*
+ * These are only for holes.  They must be called whenever a hole's
+ * properties change but also whenever a chunk becomes a hole or a hole
+ * becomes a chunk.
+ */
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item);
+static int  __must_check
+            __cma_bf_hole_insert_by_start(struct cma_bf_item *item);
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item);
+
+/**
+ * __cma_bf_hole_take - takes a chunk of memory out of a hole.
+ * @hole:	hole to take chunk from
+ * @size:	chunk's size
+ * @alignment:	chunk's starting address alignment (must be power of two)
+ *
+ * Takes a @size bytes large chunk from hole @hole which must be able
+ * to hold the chunk.  The "must be able" includes also alignment
+ * constraint.
+ *
+ * Returns allocated item or NULL on error (if kmalloc() failed).
+ */
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment);
+
+/**
+ * __cma_bf_hole_merge_maybe - tries to merge hole with neighbours.
+ * @item: hole to try and merge
+ *
+ * Which items are preserved is undefined so you may not rely on it.
+ */
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item);
+
+
+/************************* Device API *************************/
+
+int cma_bf_init(struct cma_region *reg)
+{
+	struct cma_bf_private *prv;
+	struct cma_bf_item *item;
+
+	prv = kzalloc(sizeof *prv, GFP_KERNEL);
+	if (unlikely(!prv))
+		return -ENOMEM;
+
+	item = kzalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item)) {
+		kfree(prv);
+		return -ENOMEM;
+	}
+
+	item->ch.start = reg->start;
+	item->ch.size  = reg->size;
+	item->ch.reg   = reg;
+
+	rb_root_init(&prv->by_start_root, &item->ch.by_start);
+	rb_root_init(&prv->by_size_root, &item->by_size);
+
+	reg->private_data = prv;
+	return 0;
+}
+
+void cma_bf_cleanup(struct cma_region *reg)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct cma_bf_item *item =
+		rb_entry(prv->by_size_root.rb_node,
+			 struct cma_bf_item, by_size);
+
+	/* We can assume there is only a single hole in the tree. */
+	WARN_ON(item->by_size.rb_left || item->by_size.rb_right ||
+		item->ch.by_start.rb_left || item->ch.by_start.rb_right);
+
+	kfree(item);
+	kfree(prv);
+}
+
+struct cma_chunk *cma_bf_alloc(struct cma_region *reg,
+			       size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_private *prv = reg->private_data;
+	struct rb_node *node = prv->by_size_root.rb_node;
+	struct cma_bf_item *item = NULL;
+
+	/* First find hole that is large enough */
+	while (node) {
+		struct cma_bf_item *i =
+			rb_entry(node, struct cma_bf_item, by_size);
+
+		if (i->ch.size < size) {
+			node = node->rb_right;
+		} else {
+			node = node->rb_left;
+			item = i;
+		}
+	}
+	if (!item)
+		return NULL;
+
+	/* Now look for items which can satisfy alignment requirements */
+	for (;;) {
+		dma_addr_t start = ALIGN(item->ch.start, alignment);
+		dma_addr_t end   = item->ch.start + item->ch.size;
+		if (start < end && end - start >= size) {
+			item = __cma_bf_hole_take(item, size, alignment);
+			return likely(item) ? &item->ch : NULL;
+		}
+
+		node = rb_next(&item->by_size);
+		if (!node)
+			return NULL;
+
+		item  = rb_entry(node, struct cma_bf_item, by_size);
+	}
+}
+
+void cma_bf_free(struct cma_chunk *chunk)
+{
+	struct cma_bf_item *item = container_of(chunk, struct cma_bf_item, ch);
+
+	/* Add new hole */
+	if (unlikely(__cma_bf_hole_insert_by_start(item))) {
+		/*
+		 * We're screwed...  Just free the item and forget
+		 * about it.  Things are broken beyond repair so no
+		 * sense in trying to recover.
+		 */
+		kfree(item);
+	} else {
+		__cma_bf_hole_insert_by_size(item);
+
+		/* Merge with prev and next sibling */
+		__cma_bf_hole_merge_maybe(item);
+	}
+}
+
+
+/************************* Basic Tree Manipulation *************************/
+
+static void __cma_bf_hole_insert_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_size_root.rb_node, *parent = NULL;
+	const typeof(item->ch.size) value = item->ch.size;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, by_size);
+		link = value <= i->ch.size
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->by_size, parent, link);
+	rb_insert_color(&item->by_size, &prv->by_size_root);
+}
+
+static void __cma_bf_hole_erase_by_size(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->by_size, &prv->by_size_root);
+}
+
+static int  __must_check
+            __cma_bf_hole_insert_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	struct rb_node **link = &prv->by_start_root.rb_node, *parent = NULL;
+	const typeof(item->ch.start) value = item->ch.start;
+
+	while (*link) {
+		struct cma_bf_item *i;
+		parent = *link;
+		i = rb_entry(parent, struct cma_bf_item, ch.by_start);
+
+		if (WARN_ON(value == i->ch.start))
+			/*
+			 * This should *never* happen.  And I mean
+			 * *never*.  We could even BUG on it but
+			 * hopefully things are only a bit broken,
+			 * ie. system can still run.  We produce
+			 * a warning and return an error.
+			 */
+			return -EBUSY;
+
+		link = value <= i->ch.start
+			? &parent->rb_left
+			: &parent->rb_right;
+	}
+
+	rb_link_node(&item->ch.by_start, parent, link);
+	rb_insert_color(&item->ch.by_start, &prv->by_start_root);
+	return 0;
+}
+
+static void __cma_bf_hole_erase_by_start(struct cma_bf_item *item)
+{
+	struct cma_bf_private *prv = item->ch.reg->private_data;
+	rb_erase(&item->ch.by_start, &prv->by_start_root);
+}
+
+
+/************************* More Tree Manipulation *************************/
+
+static struct cma_bf_item *__must_check
+__cma_bf_hole_take(struct cma_bf_item *hole, size_t size, dma_addr_t alignment)
+{
+	struct cma_bf_item *item;
+
+	/*
+	 * There are three cases:
+	 * 1. the chunk takes the whole hole,
+	 * 2. the chunk is at the beginning or at the end of the hole, or
+	 * 3. the chunk is in the middle of the hole.
+	 */
+
+
+	/* Case 1, the whole hole */
+	if (size == hole->ch.size) {
+		__cma_bf_hole_erase_by_size(hole);
+		__cma_bf_hole_erase_by_start(hole);
+		return hole;
+	}
+
+
+	/* Allocate */
+	item = kmalloc(sizeof *item, GFP_KERNEL);
+	if (unlikely(!item))
+		return NULL;
+
+	item->ch.start = ALIGN(hole->ch.start, alignment);
+	item->ch.size  = size;
+
+	/* Case 3, in the middle */
+	if (item->ch.start != hole->ch.start
+	 && item->ch.start + item->ch.size !=
+	    hole->ch.start + hole->ch.size) {
+		struct cma_bf_item *tail;
+
+		/*
+		 * Space between the end of the chunk and the end of
+		 * the hole, ie. space left after the end of the
+		 * chunk.  If this is divisible by alignment we can
+		 * move the chunk to the end of the hole.
+		 */
+		size_t left =
+			hole->ch.start + hole->ch.size -
+			(item->ch.start + item->ch.size);
+		if (!(left & (alignment - 1))) {
+			item->ch.start += left;
+			goto case_2;
+		}
+
+		/*
+		 * We are going to add a hole at the end.  This way,
+		 * we will reduce the problem to case 2 -- the chunk
+		 * will be at the end of the hole.
+		 */
+		tail = kmalloc(sizeof *tail, GFP_KERNEL);
+		if (unlikely(!tail)) {
+			kfree(item);
+			return NULL;
+		}
+
+		tail->ch.start = item->ch.start + item->ch.size;
+		tail->ch.size  =
+			hole->ch.start + hole->ch.size - tail->ch.start;
+		tail->ch.reg   = hole->ch.reg;
+
+		if (unlikely(__cma_bf_hole_insert_by_start(tail))) {
+			/*
+			 * Things are broken beyond repair...  Abort
+			 * inserting the hole but still continue with
+			 * allocation (seems like the best we can do).
+			 */
+
+			hole->ch.size = tail->ch.start - hole->ch.start;
+			kfree(tail);
+		} else {
+			__cma_bf_hole_insert_by_size(tail);
+			/*
+			 * It's important that we first insert the new
+			 * hole in the tree sorted by size and later
+			 * reduce the size of the old hole.  We will
+			 * update the position of the old hole in the
+			 * rb tree in code that handles case 2.
+			 */
+			hole->ch.size = tail->ch.start - hole->ch.start;
+		}
+
+		/* Go to case 2 */
+	}
+
+
+	/* Case 2, at the beginning or at the end */
+case_2:
+	/* No need to update the tree; order preserved. */
+	if (item->ch.start == hole->ch.start)
+		hole->ch.start += item->ch.size;
+
+	/* Alter hole's size */
+	hole->ch.size -= size;
+	__cma_bf_hole_erase_by_size(hole);
+	__cma_bf_hole_insert_by_size(hole);
+
+	return item;
+}
+
+
+static void __cma_bf_hole_merge_maybe(struct cma_bf_item *item)
+{
+	struct cma_bf_item *prev;
+	struct rb_node *node;
+	int twice = 2;
+
+	node = rb_prev(&item->ch.by_start);
+	if (unlikely(!node))
+		goto next;
+	prev = rb_entry(node, struct cma_bf_item, ch.by_start);
+
+	for (;;) {
+		if (prev->ch.start + prev->ch.size == item->ch.start) {
+			/* Remove previous hole from trees */
+			__cma_bf_hole_erase_by_size(prev);
+			__cma_bf_hole_erase_by_start(prev);
+
+			/* Alter this hole */
+			item->ch.size += prev->ch.size;
+			item->ch.start = prev->ch.start;
+			__cma_bf_hole_erase_by_size(item);
+			__cma_bf_hole_insert_by_size(item);
+			/*
+			 * No need to update by start trees as we do
+			 * not break sequence order
+			 */
+
+			/* Free prev hole */
+			kfree(prev);
+		}
+
+next:
+		if (!--twice)
+			break;
+
+		node = rb_next(&item->ch.by_start);
+		if (unlikely(!node))
+			break;
+		prev = item;
+		item = rb_entry(node, struct cma_bf_item, ch.by_start);
+	}
+}
+
+
+
+/************************* Register *************************/
+static int __init cma_bf_module_init(void)
+{
+	static struct cma_allocator alloc = {
+		.name    = "bf",
+		.init    = cma_bf_init,
+		.cleanup = cma_bf_cleanup,
+		.alloc   = cma_bf_alloc,
+		.free    = cma_bf_free,
+	};
+	return cma_allocator_register(&alloc);
+}
+module_init(cma_bf_module_init);
diff --git a/mm/cma.c b/mm/cma.c
new file mode 100644
index 0000000..0b02ab0
--- /dev/null
+++ b/mm/cma.c
@@ -0,0 +1,1170 @@
+/*
+ * Contiguous Memory Allocator framework
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz at samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+/*
+ * See Documentation/contiguous-memory.txt for details.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#ifndef CONFIG_NO_BOOTMEM
+#  include <linux/bootmem.h>   /* alloc_bootmem_pages_nopanic() */
+#endif
+#ifdef CONFIG_HAVE_MEMBLOCK
+#  include <linux/memblock.h>  /* memblock*() */
+#endif
+#include <linux/device.h>      /* struct device, dev_name() */
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR, PTR_ERR, etc. */
+#include <linux/mm.h>          /* PAGE_ALIGN() */
+#include <linux/module.h>      /* EXPORT_SYMBOL_GPL() */
+#include <linux/mutex.h>       /* mutex */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/string.h>      /* str*() */
+
+#include <linux/cma.h>
+
+
+/*
+ * Protects cma_regions, cma_allocators, cma_attr_map (which means
+ * cma_map and cma_map_len), cma_attr_asterisk (which means
+ * cma_asterisk and cma_asterisk_len) and cma_chunks_by_start.
+ */
+static DEFINE_MUTEX(cma_mutex);
+
+static struct kobject *cma_kobj;
+
+
+/************************* Attributes *************************/
+
+struct cma_attribute {
+	struct kobj_attribute kattr;
+	const char *value;
+	size_t length;
+	ssize_t (*validate)(const char *value);
+};
+
+#define __CMA_ATTR(_name) \
+	__ATTR(_name, 0644, cma_attr_show, cma_attr_store)
+
+static ssize_t cma_attr_show(struct kobject *kobj,
+			     struct kobj_attribute *_attr, char *page)
+{
+	struct cma_attribute *attr =
+		container_of(_attr, struct cma_attribute, kattr);
+	ssize_t len;
+
+	mutex_lock(&cma_mutex);
+
+	if (!attr->value || !attr->length) {
+		*page = 0;
+		len = 0;
+	} else {
+		len = min(attr->length, (size_t)PAGE_SIZE - 2);
+		memcpy(page, attr->value, len);
+		page[len]   = '\n';
+		page[++len] = '\0';
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	return len;
+}
+
+static ssize_t cma_attr_store(struct kobject *kobj,
+			      struct kobj_attribute *_attr,
+			      const char *page, size_t count)
+{
+	struct cma_attribute *attr =
+		container_of(_attr, struct cma_attribute, kattr);
+	ssize_t len = attr->validate(page);
+	char *val = NULL;
+
+	if (len < 0)
+		return len;
+
+	if (len) {
+		val = kmemdup(page, len + 1, GFP_KERNEL);
+		if (!val)
+			return -ENOMEM;
+		val[len] = '\0';
+	}
+
+	mutex_lock(&cma_mutex);
+
+	kfree(attr->value);
+	attr->value = val;
+	attr->length = len;
+
+	mutex_unlock(&cma_mutex);
+
+	return count;
+}
+
+
+/*
+ * param        ::= [ rules [ ';' ] ]
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' regions
+ * patterns     ::= pattern [ ',' patterns ]
+ *
+ * regions      ::= reg-name [ ',' regions ] | '*'
+ *              // list of regions to try to allocate memory
+ *              // from for devices that match pattern
+ *
+ * pattern      ::= dev-pattern [ '/' kind-pattern ]
+ *                | '/' kind-pattern
+ *              // pattern request must match for this rule to
+ *              // apply to it; the first rule that matches is
+ *              // applied; if dev-pattern part is omitted
+ *              // value identical to the one used in previous
+ *              // rule is assumed
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example (white space added for convenience, forbidden in real string):
+ * cma_map = foo-dev = reg1;             -- foo-dev with no kind
+ *           bar-dev / firmware = reg3;  -- bar-dev's firmware
+ *           / * = reg2;                 -- bar-dev's all other kinds
+ *           baz-dev / * = reg1,reg2;    -- any kind of baz-dev
+ *           * / * = reg2,reg1;          -- any other allocations
+ */
+static ssize_t cma_map_validate(const char *param)
+{
+	const char *ch = param;
+
+	if (*ch == '\0' || *ch == '\n')
+		return 0;
+
+	for (;;) {
+		const char *start = ch;
+
+		while (*ch && *ch != '\n' && *ch != ';' && *ch != '=')
+			++ch;
+
+		if (*ch != '=' || start == ch) {
+			pr_err("map: expecting \"<patterns>=<regions>\" near %s\n", start);
+			return -EINVAL;
+		}
+
+		if (ch[1] == '*' && (ch[2] && ch[2] != '\n' && ch[2] != ';')) {
+			pr_err("map: expecting ';' or end after '*' near %s\n", start);
+			return -EINVAL;
+		}
+
+		while (*++ch != ';')
+			if (!*ch || *ch == '\n')
+				return ch - param;
+		if (ch[1] == '\0' || ch[1] == '\n')
+			return ch - param;
+		++ch;
+	}
+}
+
+
+static struct cma_attribute cma_attributes[] = {
+	{
+		.kattr		= __CMA_ATTR(map),
+		.validate	= &cma_map_validate
+	},
+	{
+		.kattr		= __CMA_ATTR(asterisk),
+		.validate	= (ssize_t (*)(const char *))&strlen,
+	},
+};
+
+#define cma_map       (cma_attributes[0].value)
+#define cma_asterisk  (cma_attributes[1].value)
+
+
+static int __init cma_attributes_init(void)
+{
+	struct cma_attribute *attr = cma_attributes;
+	int i = ARRAY_SIZE(cma_attributes), ret = 0;
+
+	do {
+		if (attr->value) {
+			/* Apply defaults.  Ignore parsing errors. */
+			const char *v = attr->value;
+			attr->value = NULL;
+			cma_attr_store(NULL, &attr->kattr, v, 0);
+		}
+
+		/*
+		 * Even if there was an error creating one file
+		 * initialise the rest.
+		 */
+		if (cma_kobj && !ret)
+			ret = sysfs_create_file(cma_kobj, &attr->kattr.attr);
+	} while (++attr, --i);
+
+	return ret;
+}
+
+
+
+/************************* Early parameter *************************/
+
+struct list_head cma_early_regions __initdata =
+	LIST_HEAD_INIT(cma_early_regions);
+
+/*
+ * regions-attr ::= [ regions [ ';' ] ]
+ * regions      ::= region [ ';' regions ]
+ *
+ * region       ::= [ '-' ] reg-name
+ *                    '=' size
+ *                  [ '@' start ]
+ *                  [ '/' alignment ]
+ *                  [ ':' alloc-name ]
+ *
+ * See Documentation/contiguous-memory.txt for details.
+ *
+ * Example:
+ * cma=reg1=64M:bf;reg2=32M@0x100000:bf;reg3=64M/1M:bf
+ *
+ * If the allocator is omitted, the first available allocator will be used.
+ */
+
+#define NUMPARSE(cond_ch, type, cond) ({				\
+		unsigned long long v = 0;				\
+		if (*param == (cond_ch)) {				\
+			const char *const msg = param + 1;		\
+			v = memparse(msg, &param);			\
+			if (!v || v > ~(type)0 || !(cond)) {		\
+				pr_err("param: invalid value near %s\n", msg); \
+				ret = -EINVAL;				\
+				break;					\
+			}						\
+		}							\
+		v;							\
+	})
+
+static int __init cma_param_parse(char *param)
+{
+	static int already_parsed __initdata;
+
+	static struct cma_region regions[16];
+	static char buffer[128] __initdata;
+
+	size_t left = ARRAY_SIZE(regions);
+	struct cma_region *reg = regions;
+
+	size_t buf_left = ARRAY_SIZE(buffer);
+	char *buf = buffer;
+
+	int ret = 0;
+
+	pr_debug("param: %s\n", param);
+
+	if (already_parsed)
+		return 0;
+	already_parsed = 1;
+
+	for (; *param; ++reg) {
+		dma_addr_t start, alignment;
+		size_t size;
+		int minus;
+
+		if (unlikely(!--left)) {
+			pr_err("param: too many early regions\n");
+			return -ENOSPC;
+		}
+
+		/* Parse name */
+		minus = *param == '-';
+		if (minus)
+			++param;
+
+		reg->name = param;
+		param = strchr(param, '=');
+		if (!param || param == reg->name) {
+			pr_err("param: expected \"<name>=\" near %s\n",
+			       reg->name);
+			ret = -EINVAL;
+			break;
+		}
+		*param = '\0';
+
+		/* Copy name to buffer */
+		if (!minus) {
+			size_t len = param - reg->name;
+			if (len >= buf_left) {
+				pr_err("param: names too long\n");
+				ret = -ENOSPC;
+				break;
+			}
+
+			memcpy(buf, reg->name, len);
+			buf += len;
+			*buf = ',';
+			++buf;
+			buf_left -= len + 1;	/* name plus ',' */
+		}
+
+		/* Parse numbers */
+		size      = NUMPARSE('\0', size_t, true);
+		start     = NUMPARSE('@', dma_addr_t, true);
+		alignment = NUMPARSE('/', dma_addr_t, (v & (v - 1)) == 0);
+
+		alignment = max(alignment, (dma_addr_t)PAGE_SIZE);
+		start     = ALIGN(start, alignment);
+		size      = PAGE_ALIGN(size);
+		if (start + size < start) {
+			pr_err("param: invalid start, size combination\n");
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Parse allocator */
+		if (*param == ':') {
+			reg->alloc_name = ++param;
+			while (*param && *param != ';')
+				++param;
+			if (param == reg->alloc_name)
+				reg->alloc_name = NULL;
+		}
+
+		/* Go to next */
+		if (*param == ';') {
+			*param = '\0';
+			++param;
+		} else if (*param) {
+			pr_err("param: expecting ';' or end of parameter near %s\n",
+			       param);
+			ret = -EINVAL;
+			break;
+		}
+
+		/* Add */
+		reg->size      = size;
+		reg->start     = start;
+		reg->alignment = alignment;
+		reg->copy_name = 1;
+
+		list_add_tail(&reg->list, &cma_early_regions);
+
+		pr_debug("param: registering early region %s (%p@%p/%p)\n",
+			 reg->name, (void *)reg->size, (void *)reg->start,
+			 (void *)reg->alignment);
+	}
+
+	if (buf != buffer) {
+		buf[-1] = '\0';
+		cma_asterisk = buffer;
+	}
+	return ret;
+}
+early_param("cma", cma_param_parse);
+
+#undef NUMPARSE
+
+
+int __init __must_check cma_early_region_register(struct cma_region *reg)
+{
+	dma_addr_t start, alignment;
+	size_t size;
+
+	if (reg->alignment & (reg->alignment - 1))
+		return -EINVAL;
+
+	alignment = max(reg->alignment, (dma_addr_t)PAGE_SIZE);
+	start     = ALIGN(reg->start, alignment);
+	size      = PAGE_ALIGN(reg->size);
+
+	if (start + size < start)
+		return -EINVAL;
+
+	reg->size      = size;
+	reg->start     = start;
+	reg->alignment = alignment;
+
+	list_add_tail(&reg->list, &cma_early_regions);
+
+	pr_debug("param: registering early region %s (%p@%p/%p)\n",
+		 reg->name, (void *)reg->size, (void *)reg->start,
+		 (void *)reg->alignment);
+
+	return 0;
+}
+
+
+
+/************************* Regions & Allocators *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg);
+
+
+/* List of all regions.  Named regions are kept before unnamed. */
+static LIST_HEAD(cma_regions);
+
+#define cma_foreach_region(reg) \
+	list_for_each_entry(reg, &cma_regions, list)
+
+int __must_check cma_region_register(struct cma_region *reg)
+{
+	const char *name, *alloc_name;
+	struct cma_region *r;
+	char *ch = NULL;
+	int ret = 0;
+
+	if (!reg->size || reg->start + reg->size < reg->start)
+		return -EINVAL;
+
+	/* reg->users = 0; */
+	reg->used = 0;
+	reg->private_data = NULL;
+	reg->registered = 0;
+	reg->free_space = reg->size;
+
+	/* Copy name and alloc_name */
+	name = reg->name;
+	alloc_name = reg->alloc_name;
+	if (reg->copy_name && (reg->name || reg->alloc_name)) {
+		size_t name_size, alloc_size;
+
+		name_size  = reg->name       ? strlen(reg->name) + 1       : 0;
+		alloc_size = reg->alloc_name ? strlen(reg->alloc_name) + 1 : 0;
+
+		ch = kmalloc(name_size + alloc_size, GFP_KERNEL);
+		if (!ch) {
+			pr_err("%s: not enough memory to allocate name\n",
+			       reg->name ?: "(private)");
+			return -ENOMEM;
+		}
+
+		/* Do not advance ch: the error path must kfree() the base. */
+		if (name_size) {
+			memcpy(ch, reg->name, name_size);
+			name = ch;
+		}
+
+		if (alloc_size) {
+			alloc_name = ch + name_size;
+			memcpy(ch + name_size, reg->alloc_name, alloc_size);
+		}
+	}
+
+	mutex_lock(&cma_mutex);
+
+	/* Don't let regions overlap */
+	cma_foreach_region(r)
+		if (r->start + r->size > reg->start &&
+		    r->start < reg->start + reg->size) {
+			ret = -EADDRINUSE;
+			goto done;
+		}
+
+	if (reg->alloc) {
+		ret = __cma_region_attach_alloc(reg);
+		if (unlikely(ret < 0))
+			goto done;
+	}
+
+	reg->name = name;
+	reg->alloc_name = alloc_name;
+	reg->registered = 1;
+	ch = NULL;
+
+	/*
+	 * Keep named at the beginning and unnamed (private) at the
+	 * end.  This helps in traversal when named region is looked
+	 * for.
+	 */
+	if (name)
+		list_add(&reg->list, &cma_regions);
+	else
+		list_add_tail(&reg->list, &cma_regions);
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: region %sregistered\n",
+		 reg->name ?: "(private)", ret ? "not " : "");
+	if (ch)
+		kfree(ch);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_region_register);
+
+static struct cma_region *__must_check
+__cma_region_find(const char **namep)
+{
+	struct cma_region *reg;
+	const char *ch, *name;
+	size_t n;
+
+	for (ch = *namep; *ch && *ch != ',' && *ch != ';'; ++ch)
+		/* nop */;
+	name = *namep;
+	*namep = *ch == ',' ? ch + 1 : ch;
+	n = ch - name;
+
+	/*
+	 * Named regions are kept in front of unnamed so if we
+	 * encounter unnamed region we can stop.
+	 */
+	cma_foreach_region(reg)
+		if (!reg->name)
+			break;
+		else if (!strncmp(name, reg->name, n) && !reg->name[n])
+			return reg;
+
+	return NULL;
+}
+
+
+/* List of all allocators. */
+static LIST_HEAD(cma_allocators);
+
+#define cma_foreach_allocator(alloc) \
+	list_for_each_entry(alloc, &cma_allocators, list)
+
+int cma_allocator_register(struct cma_allocator *alloc)
+{
+	struct cma_region *reg;
+	int first;
+
+	if (!alloc->alloc || !alloc->free)
+		return -EINVAL;
+
+	/* alloc->users = 0; */
+
+	mutex_lock(&cma_mutex);
+
+	first = list_empty(&cma_allocators);
+
+	list_add_tail(&alloc->list, &cma_allocators);
+
+	/*
+	 * Attach this allocator to all allocator-less regions that
+	 * request this particular allocator (reg->alloc_name equals
+	 * alloc->name) or if region wants the first available
+	 * allocator and we are the first.
+	 */
+	cma_foreach_region(reg) {
+		if (reg->alloc)
+			continue;
+		if (!(reg->alloc_name
+		      ? alloc->name && !strcmp(alloc->name, reg->alloc_name)
+		      : (!reg->used && first)))
+			continue;
+
+		reg->alloc = alloc;
+		__cma_region_attach_alloc(reg);
+	}
+
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("%s: allocator registered\n", alloc->name ?: "(unnamed)");
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cma_allocator_register);
+
+static struct cma_allocator *__must_check
+__cma_allocator_find(const char *name)
+{
+	struct cma_allocator *alloc;
+
+	if (!name)
+		return list_empty(&cma_allocators)
+			? NULL
+			: list_entry(cma_allocators.next,
+				     struct cma_allocator, list);
+
+	cma_foreach_allocator(alloc)
+		if (alloc->name && !strcmp(name, alloc->name))
+			return alloc;
+
+	return NULL;
+}
+
+
+
+/************************* Initialise CMA *************************/
+
+int __init cma_set_defaults(char *regions, const char *map,
+                            const char *asterisk)
+{
+	if (map)
+		cma_map = map;
+
+	if (asterisk)
+		cma_asterisk = asterisk;
+
+	if (regions)
+		return cma_param_parse(regions);
+	else
+		return 0;
+}
+
+
+int __init cma_early_region_reserve(struct cma_region *reg)
+{
+	int tried = 0;
+
+	if (!reg->size || (reg->alignment & (reg->alignment - 1)) ||
+	    reg->reserved)
+		return -EINVAL;
+
+#ifndef CONFIG_NO_BOOTMEM
+
+	tried = 1;
+
+	{
+		void *ptr;
+
+		ptr = __alloc_bootmem_nopanic(reg->size, reg->alignment,
+					      reg->start);
+		if (ptr) {
+			reg->start = virt_to_phys(ptr);
+			reg->reserved = 1;
+			return 0;
+		}
+	}
+
+#endif
+
+#ifdef CONFIG_HAVE_MEMBLOCK
+
+	tried = 1;
+
+	if (reg->start) {
+		if (memblock_is_region_reserved(reg->start, reg->size) < 0 &&
+		    memblock_reserve(reg->start, reg->size) >= 0) {
+			reg->reserved = 1;
+			return 0;
+		}
+	} else {
+		/*
+		 * Use __memblock_alloc_base() since
+		 * memblock_alloc_base() panic()s.
+		 */
+		u64 ret = __memblock_alloc_base(reg->size, reg->alignment, 0);
+		if (ret &&
+		    ret < ~(dma_addr_t)0 &&
+		    ret + reg->size < ~(dma_addr_t)0 &&
+		    ret + reg->size > ret) {
+			reg->start = ret;
+			reg->reserved = 1;
+			return 0;
+		}
+
+		if (ret)
+			memblock_free(ret, reg->size);
+	}
+
+#endif
+
+	return tried ? -ENOMEM : -EOPNOTSUPP;
+}
+
+void __init cma_early_regions_reserve(int (*reserve)(struct cma_region *reg))
+{
+	struct cma_region *reg;
+
+	pr_debug("init: reserving early regions\n");
+
+	if (!reserve)
+		reserve = cma_early_region_reserve;
+
+	list_for_each_entry(reg, &cma_early_regions, list) {
+		if (reg->reserved) {
+			/* nothing */
+		} else if (reserve(reg) >= 0) {
+			pr_debug("init: %s: reserved %p@%p\n",
+				 reg->name ?: "(private)",
+				 (void *)reg->size, (void *)reg->start);
+			reg->reserved = 1;
+		} else {
+			pr_warn("init: %s: unable to reserve %p@%p/%p\n",
+				reg->name ?: "(private)",
+				(void *)reg->size, (void *)reg->start,
+				(void *)reg->alignment);
+		}
+	}
+}
+
+
+static int __init cma_init(void)
+{
+	struct cma_region *reg, *n;
+
+	pr_debug("init: initialising\n");
+
+	cma_kobj = kobject_create_and_add("contiguous", mm_kobj);
+	if (unlikely(!cma_kobj))
+		pr_err("init: unable to add sysfs directory\n");
+
+	cma_attributes_init();
+
+	list_for_each_entry_safe(reg, n, &cma_early_regions, list) {
+		INIT_LIST_HEAD(&reg->list);
+		/*
+		 * We don't care if there was an error.  It's a pity
+		 * but there's not much we can do about it any way.
+		 * If the error is on a region that was parsed from
+		 * command line then it will stay and waste a bit of
+		 * space; if it was registered using
+		 * cma_early_region_register() it's caller's
+		 * responsibility to do something about it.
+		 */
+		if (reg->reserved && cma_region_register(reg) < 0)
+			/* ignore error */;
+	}
+
+	INIT_LIST_HEAD(&cma_early_regions);
+
+	return 0;
+}
+__initcall(cma_init);
+
+
+
+/************************* Chunks *************************/
+
+/* All chunks sorted by start address. */
+static struct rb_root cma_chunks_by_start;
+
+static struct cma_chunk *__must_check __cma_chunk_find(dma_addr_t addr)
+{
+	struct cma_chunk *chunk;
+	struct rb_node *n;
+
+	for (n = cma_chunks_by_start.rb_node; n; ) {
+		chunk = rb_entry(n, struct cma_chunk, by_start);
+		if (addr < chunk->start)
+			n = n->rb_left;
+		else if (addr > chunk->start)
+			n = n->rb_right;
+		else
+			return chunk;
+	}
+	WARN(1, "no chunk starting at %p\n", (void *)addr);
+	return NULL;
+}
+
+static int __must_check __cma_chunk_insert(struct cma_chunk *chunk)
+{
+	struct rb_node **new, *parent = NULL;
+	typeof(chunk->start) addr = chunk->start;
+
+	for (new = &cma_chunks_by_start.rb_node; *new; ) {
+		struct cma_chunk *c =
+			container_of(*new, struct cma_chunk, by_start);
+
+		parent = *new;
+		if (addr < c->start) {
+			new = &(*new)->rb_left;
+		} else if (addr > c->start) {
+			new = &(*new)->rb_right;
+		} else {
+			/*
+			 * We should never be here.  If we are it
+			 * means allocator gave us an invalid chunk
+			 * (one that has already been allocated) so we
+			 * refuse to accept it.  Our caller will
+			 * recover by freeing the chunk.
+			 */
+			WARN_ON(1);
+			return -EADDRINUSE;
+		}
+	}
+
+	rb_link_node(&chunk->by_start, parent, new);
+	rb_insert_color(&chunk->by_start, &cma_chunks_by_start);
+
+	return 0;
+}
+
+static void __cma_chunk_release(struct kref *ref)
+{
+	struct cma_chunk *chunk = container_of(ref, struct cma_chunk, ref);
+
+	rb_erase(&chunk->by_start, &cma_chunks_by_start);
+
+	/* --chunk->reg->users; */
+	chunk->reg->free_space += chunk->size;
+	chunk->reg->alloc->free(chunk);	/* last: may invalidate chunk */
+}
+
+
+
+/************************* The Device API *************************/
+
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *kind);
+
+
+/* Allocate. */
+
+static dma_addr_t __must_check
+__cma_alloc_from_region(struct cma_region *reg,
+			size_t size, dma_addr_t alignment)
+{
+	struct cma_chunk *chunk;
+
+	if (!reg || reg->free_space < size)
+		return -ENOMEM;
+
+	if (!reg->alloc) {
+		if (!reg->used)
+			__cma_region_attach_alloc(reg);
+		if (!reg->alloc)
+			return -ENOMEM;
+	}
+
+	chunk = reg->alloc->alloc(reg, size, alignment);
+	if (!chunk)
+		return -ENOMEM;
+
+	if (unlikely(__cma_chunk_insert(chunk) < 0)) {
+		/* We should *never* be here.  chunk->reg is not set yet. */
+		reg->alloc->free(chunk);
+		kfree(chunk);
+		return -EADDRINUSE;
+	}
+
+	chunk->reg = reg;
+	kref_init(&chunk->ref);
+	/* ++reg->users; */
+	reg->free_space -= chunk->size;
+	pr_debug("allocated at %p\n", (void *)chunk->start);
+	return chunk->start;
+}
+
+dma_addr_t __must_check
+cma_alloc_from_region(struct cma_region *reg,
+		      size_t size, dma_addr_t alignment)
+{
+	dma_addr_t addr;
+
+	mutex_lock(&cma_mutex);
+
+	pr_debug("allocate %p/%p from %s\n",
+		 (void *)size, (void *)alignment,
+		 reg ? reg->name ?: "(private)" : "(null)");
+
+	if (!reg || !size || alignment & (alignment - 1) || !reg->registered)
+		addr = -EINVAL;
+	else
+		addr = __cma_alloc_from_region(reg, PAGE_ALIGN(size),
+					       max(alignment, (dma_addr_t)PAGE_SIZE));
+
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(cma_alloc_from_region);
+
+dma_addr_t __must_check
+__cma_alloc(const struct device *dev, const char *kind,
+	    dma_addr_t size, dma_addr_t alignment)
+{
+	struct cma_region *reg;
+	const char *from;
+	dma_addr_t addr;
+
+	if (dev)
+		pr_debug("allocate %p/%p for %s/%s\n",
+			 (void *)size, (void *)alignment,
+			 dev_name(dev), kind ?: "");
+
+	if (!size || alignment & (alignment - 1))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, kind);
+	if (unlikely(IS_ERR(from))) {
+		addr = PTR_ERR(from);
+		goto done;
+	}
+
+	pr_debug("allocate %p/%p from one of %s\n",
+		 (void *)size, (void *)alignment, from);
+
+	size = PAGE_ALIGN(size);
+	if (alignment < PAGE_SIZE)
+		alignment = PAGE_SIZE;
+
+	while (*from && *from != ';') {
+		reg = __cma_region_find(&from);
+		addr = __cma_alloc_from_region(reg, size, alignment);
+		if (!IS_ERR_VALUE(addr))
+			goto done;
+	}
+
+	pr_debug("not enough memory\n");
+	addr = -ENOMEM;
+
+done:
+	mutex_unlock(&cma_mutex);
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(__cma_alloc);
+
+
+/* Query information about regions. */
+int
+__cma_info(struct cma_info *infop, const struct device *dev, const char *kind)
+{
+	struct cma_info info = { ~(dma_addr_t)0, 0, 0, 0, 0 };
+	const char *from;
+	int ret;
+
+	if (unlikely(!infop))
+		return -EINVAL;
+
+	mutex_lock(&cma_mutex);
+
+	from = __cma_where_from(dev, kind);
+	if (IS_ERR(from)) {
+		ret = PTR_ERR(from);
+		info.lower_bound = 0;
+		goto done;
+	}
+
+	while (*from && *from != ';') {
+		struct cma_region *reg = __cma_region_find(&from);
+		if (reg) {
+			info.total_size += reg->size;
+			info.free_size += reg->free_space;
+			if (info.lower_bound > reg->start)
+				info.lower_bound = reg->start;
+			if (info.upper_bound < reg->start + reg->size)
+				info.upper_bound = reg->start + reg->size;
+			++info.count;
+		}
+	}
+
+	ret = 0;
+done:
+	mutex_unlock(&cma_mutex);
+	memcpy(infop, &info, sizeof info);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(__cma_info);
+
+
+/* kref handling. */
+
+int cma_get(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+
+	mutex_lock(&cma_mutex);
+	c = __cma_chunk_find(addr);
+	if (c)
+		kref_get(&c->ref);
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("get(%p): %sfound\n", (void *)addr, c ? "" : "not ");
+	return c ? 0 : -ENOENT;
+}
+EXPORT_SYMBOL_GPL(cma_get);
+
+int cma_put(dma_addr_t addr)
+{
+	struct cma_chunk *c;
+	int ret;
+
+	mutex_lock(&cma_mutex);
+	c = __cma_chunk_find(addr);
+	ret = c ? kref_put(&c->ref, __cma_chunk_release) : -ENOENT;
+	mutex_unlock(&cma_mutex);
+
+	pr_debug("put(%p): %sfound%s\n", (void *)addr, c ? "" : "not ",
+		 ret > 0 ? " and destroyed" : "");
+	return ret;
+}
+EXPORT_SYMBOL_GPL(cma_put);
+
+
+/************************* Miscellaneous *************************/
+
+static int __cma_region_attach_alloc(struct cma_region *reg)
+{
+	struct cma_allocator *alloc;
+	int ret;
+
+	/*
+	 * If reg->alloc is set then caller wants us to use this
+	 * allocator.  Otherwise we need to find one by name.
+	 */
+	if (reg->alloc) {
+		alloc = reg->alloc;
+	} else {
+		alloc = __cma_allocator_find(reg->alloc_name);
+		if (!alloc) {
+			pr_warn("init: %s: %s: no such allocator\n",
+				reg->name ?: "(private)",
+				reg->alloc_name ?: "(default)");
+			reg->used = 1;
+			return -ENOENT;
+		}
+	}
+
+	/* Try to initialise the allocator. */
+	reg->private_data = NULL;
+	ret = alloc->init ? alloc->init(reg) : 0;
+	if (unlikely(ret < 0)) {
+		pr_err("init: %s: %s: unable to initialise allocator\n",
+		       reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+		reg->alloc = NULL;
+		reg->used = 1;
+	} else {
+		reg->alloc = alloc;
+		/* ++alloc->users; */
+		pr_debug("init: %s: %s: initialised allocator\n",
+			 reg->name ?: "(private)", alloc->name ?: "(unnamed)");
+	}
+	return ret;
+}
+
+
+/*
+ * s            ::= rules
+ * rules        ::= rule [ ';' rules ]
+ * rule         ::= patterns '=' [ regions ]
+ * patterns     ::= pattern [ ',' patterns ]
+ * pattern      ::= dev-pattern [ '/' kind-pattern ]
+ *                | '/' kind-pattern
+ */
+static const char *__must_check
+__cma_where_from(const struct device *dev, const char *kind)
+{
+	/*
+	 * This function matches the patterns from the map attribute
+	 * against the given device name and kind.  Kind may of course
+	 * be NULL or an empty string.
+	 */
+
+	const char *s, *name;
+	int name_matched = 0;
+
+	/*
+	 * If dev is NULL we were called in alternative form where
+	 * kind is the from string.  All we have to do is return it
+	 * unless it's NULL or "*" in which case we return
+	 * cma_asterisk.
+	 */
+	if (!dev) {
+		if (!kind || *kind == '*')
+			return cma_asterisk ?: ERR_PTR(-ENOENT);
+		else
+			return kind;
+	}
+
+	if (!cma_map)
+		return cma_asterisk ?: ERR_PTR(-ENOENT);
+
+	name = dev_name(dev);
+	if (WARN_ON(!name || !*name))
+		return ERR_PTR(-EINVAL);
+
+	if (!kind)
+		kind = "";
+
+	/*
+	 * Now we go through the cma_map parameter.  It is what has
+	 * been provided by command line.
+	 */
+	for (s = cma_map; *s; ++s) {
+		const char *c;
+
+		/*
+		 * If the pattern starts with a slash, the device part of the
+		 * pattern matches if it matched previously.
+		 */
+		if (*s == '/') {
+			if (!name_matched)
+				goto look_for_next;
+			goto match_kind;
+		}
+
+		/*
+		 * We are now trying to match the device name.  This also
+		 * updates the name_matched variable.  If, while reading the
+		 * spec, we encounter comma it means that the pattern does not
+		 * match and we need to start over with another pattern (the
+		 * one after the comma).  If we encounter equal sign we need
+		 * to start over with another rule.  If there is a character
+		 * that does not match, we need to look for a comma (to get
+		 * another pattern) or semicolon (to get another rule) and try
+		 * again if there is one somewhere.
+		 */
+
+		name_matched = 0;
+
+		for (c = name; *s != '*' && *c; ++c, ++s)
+			if (*s == '=')
+				goto next_rule;
+			else if (*s == ',')
+				continue;
+			else if (*s != '?' && *c != *s)
+				goto look_for_next;
+		if (*s == '*')
+			++s;
+
+		name_matched = 1;
+
+		/*
+		 * Now we need to match the kind part of the pattern.  If the
+		 * pattern is missing it we match only if kind points to an
+		 * empty string.  Otherwise we try to match it just like name.
+		 */
+		if (*s != '/') {
+			if (*kind)
+				goto look_for_next;
+		} else {
+match_kind:		/* s points to '/' */
+			++s;
+
+			for (c = kind; *s != '*' && *c; ++c, ++s)
+				if (*s == '=')
+					goto next_rule;
+				else if (*s == ',')
+					continue;
+				else if (*s != '?' && *c != *s)
+					goto look_for_next;
+			if (*s == '*')
+				++s;
+		}
+
+		/* Return the string behind the '=' sign of the rule. */
+		if (*s == '=' || *s == ',') {
+			s = strchr(s, '=') + 1;
+			return *s == '*' ? cma_asterisk ?: ERR_PTR(-ENOENT) : s;
+		}
+
+look_for_next:
+		do {
+			++s;
+		} while (*s != ',' && *s != '=');
+		if (*s == ',')
+			continue;
+
+next_rule:	/* s points to '=' */
+		s = strchr(s, ';');
+		if (!s)
+			break;
+	}
+
+	return ERR_PTR(-ENOENT);
+}
-- 
1.7.1
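
For readers who want to experiment with the map syntax outside the kernel,
the rule grammar documented above __cma_where_from() can be approximated in
plain user-space C.  The following is only a sketch under stated assumptions,
not the code from the patch: glob_match() and cma_map_lookup() are
hypothetical names, the "previous device match" state is reset for every rule
(the patch keeps it across rules), anything following a '*' in a pattern is
ignored, and the special "*" regions value (cma_asterisk) is not handled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: match text against pat[0..len).  '?' matches any
 * single character; '*' matches the rest of text (anything following the
 * '*' in the pattern is ignored in this sketch).
 */
static int glob_match(const char *pat, size_t len, const char *text)
{
	size_t i;

	for (i = 0; i < len; ++i, ++text) {
		if (pat[i] == '*')
			return 1;
		if (!*text || (pat[i] != '?' && pat[i] != *text))
			return 0;
	}
	return !*text;	/* pattern exhausted: text must be exhausted too */
}

/*
 * Hypothetical lookup: return a pointer just past the '=' of the first
 * rule with a pattern matching dev/kind, or NULL if no rule matches.
 * The returned string runs to the end of map; the regions of the rule
 * end at the next ';'.
 */
const char *cma_map_lookup(const char *map, const char *dev, const char *kind)
{
	const char *rule = map;

	if (!kind)
		kind = "";

	while (*rule) {
		const char *eq = strchr(rule, '=');
		const char *pat, *next;
		int dev_matched = 0;

		if (!eq)
			break;		/* malformed rule, give up */

		for (pat = rule; pat < eq; ) {
			const char *comma = memchr(pat, ',', (size_t)(eq - pat));
			const char *end = comma ? comma : eq;
			const char *slash = memchr(pat, '/', (size_t)(end - pat));
			int kind_matched;

			/* "/kind" reuses the device result of the previous pattern. */
			if (slash != pat)
				dev_matched = glob_match(pat,
					(size_t)((slash ? slash : end) - pat), dev);

			/* No kind part: match only an empty kind. */
			kind_matched = slash
				? glob_match(slash + 1, (size_t)(end - slash - 1), kind)
				: !*kind;

			if (dev_matched && kind_matched)
				return eq + 1;

			pat = comma ? comma + 1 : eq;
		}

		next = strchr(eq, ';');
		if (!next)
			break;
		rule = next + 1;
	}
	return NULL;
}
```

With the map string used for Aquila and Goni in patch 4/4,
"s3c-mfc5/f=mfc_fw;s3c-mfc5/a=mfc_b1;s3c-mfc5/b=mfc_b2", looking up device
"s3c-mfc5" with kind "a" returns a pointer to "mfc_b1;...".  As in
__cma_alloc(), the caller is expected to stop reading at the ';'.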

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCHv2 3/4] mm: cma: Test device and application added
@ 2010-07-26 14:11       ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds a "cma" misc device which lets user space use the
CMA API.  This device is meant for testing.  A testing application
is also provided.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/misc/Kconfig   |    8 +
 drivers/misc/Makefile  |    1 +
 drivers/misc/cma-dev.c |  184 ++++++++++++++++++++++++
 include/linux/cma.h    |   30 ++++
 tools/cma/cma-test.c   |  373 ++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 596 insertions(+), 0 deletions(-)
 create mode 100644 drivers/misc/cma-dev.c
 create mode 100644 tools/cma/cma-test.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 9b089df..6ae3d9f 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -368,4 +368,12 @@ source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
 source "drivers/misc/iwmc3200top/Kconfig"
 
+config CMA_DEVICE
+	tristate "CMA misc device (DEVELOPMENT)"
+	depends on CMA
+	help
+	  The CMA misc device allows allocating contiguous memory areas
+	  from user space.  This is mostly for testing of the CMA
+	  framework.
+
 endif # MISC_DEVICES
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 67552d6..9921370 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -32,3 +32,4 @@ obj-y				+= eeprom/
 obj-y				+= cb710/
 obj-$(CONFIG_VMWARE_BALLOON)	+= vmware_balloon.o
 obj-$(CONFIG_ARM_CHARLCD)	+= arm-charlcd.o
+obj-$(CONFIG_CMA_DEVICE)	+= cma-dev.o
diff --git a/drivers/misc/cma-dev.c b/drivers/misc/cma-dev.c
new file mode 100644
index 0000000..7d7bc05
--- /dev/null
+++ b/drivers/misc/cma-dev.c
@@ -0,0 +1,184 @@
+/*
+ * Contiguous Memory Allocator userspace driver
+ * Copyright (c) 2010 by Samsung Electronics.
+ * Written by Michal Nazarewicz (m.nazarewicz@samsung.com)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License or (at your option) any later version of the license.
+ */
+
+#define pr_fmt(fmt) "cma: " fmt
+
+#ifdef CONFIG_CMA_DEBUG
+#  define DEBUG
+#endif
+
+#include <linux/errno.h>       /* Error numbers */
+#include <linux/err.h>         /* IS_ERR_VALUE() */
+#include <linux/fs.h>          /* struct file */
+#include <linux/mm.h>          /* Memory stuff */
+#include <linux/mman.h>
+#include <linux/slab.h>
+#include <linux/module.h>      /* Standard module stuff */
+#include <linux/device.h>      /* struct device, dev_dbg() */
+#include <linux/types.h>       /* Just to be safe ;) */
+#include <linux/uaccess.h>     /* __copy_{to,from}_user */
+#include <linux/miscdevice.h>  /* misc_register() and company */
+
+#include <linux/cma.h>
+
+static int  cma_file_open(struct inode *inode, struct file *file);
+static int  cma_file_release(struct inode *inode, struct file *file);
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg);
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma);
+
+
+static struct miscdevice cma_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name  = "cma",
+	.fops  = &(const struct file_operations) {
+		.owner          = THIS_MODULE,
+		.open           = cma_file_open,
+		.release        = cma_file_release,
+		.unlocked_ioctl = cma_file_ioctl,
+		.mmap           = cma_file_mmap,
+	},
+};
+#define cma_dev (cma_miscdev.this_device)
+
+
+#define cma_file_start(file) (((dma_addr_t *)(file)->private_data)[0])
+#define cma_file_size(file)  (((dma_addr_t *)(file)->private_data)[1])
+
+
+static int  cma_file_open(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	file->private_data = NULL;
+
+	return 0;
+}
+
+
+static int  cma_file_release(struct inode *inode, struct file *file)
+{
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (file->private_data) {
+		cma_put(cma_file_start(file));
+		kfree(file->private_data);
+	}
+
+	return 0;
+}
+
+
+static long cma_file_ioctl(struct file *file, unsigned cmd, unsigned long arg)
+{
+	struct cma_alloc_request req;
+	struct device fake_device;
+	unsigned long addr;
+	long ret;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (cmd != IOCTL_CMA_ALLOC)
+		return -ENOTTY;
+
+	if (!arg)
+		return -EINVAL;
+
+	if (file->private_data) /* Already allocated */
+		return -EBADFD;
+
+	if (copy_from_user(&req, (void __user *)arg, sizeof req))
+		return -EFAULT;
+
+	if (req.magic != CMA_MAGIC)
+		return -ENOTTY;
+
+	/* __u64 values may not fit dma_addr_t on a 32 bit system. */
+	if (req.size > ~(dma_addr_t)0 ||
+	    req.alignment > ~(dma_addr_t)0)
+		return -EINVAL;
+
+	if (strnlen(req.name, sizeof req.name) >= sizeof req.name
+	 || strnlen(req.kind, sizeof req.kind) >= sizeof req.kind)
+		return -EINVAL;
+
+	file->private_data = kmalloc(2 * sizeof(dma_addr_t), GFP_KERNEL);
+	if (!file->private_data)
+		return -ENOMEM;
+
+	fake_device.init_name = req.name;
+	fake_device.kobj.name = req.name;
+	addr = cma_alloc(&fake_device, req.kind, req.size, req.alignment);
+	if (IS_ERR_VALUE(addr)) {
+		ret = addr;
+		goto error_priv;
+	}
+
+	if (put_user(addr, (typeof(req.start) *)(arg + offsetof(typeof(req), start)))) {
+		ret = -EFAULT;
+		goto error_put;
+	}
+
+	cma_file_start(file) = addr;
+	cma_file_size(file) = req.size;
+
+	dev_dbg(cma_dev, "allocated %p@%p\n",
+		(void *)(dma_addr_t)req.size, (void *)addr);
+
+	return 0;
+
+error_put:
+	cma_put(addr);
+error_priv:
+	kfree(file->private_data);
+	file->private_data = NULL;
+	return ret;
+}
+
+
+static int  cma_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	unsigned long pgoff, offset, length;
+
+	dev_dbg(cma_dev, "%s(%p)\n", __func__, (void *)file);
+
+	if (!file->private_data)
+		return -EBADFD;
+
+	pgoff  = vma->vm_pgoff;
+	offset = pgoff << PAGE_SHIFT;
+	length = vma->vm_end - vma->vm_start;
+
+	if (offset          >= cma_file_size(file)
+	 || length          >  cma_file_size(file)
+	 || offset + length >  cma_file_size(file))
+		return -ENOSPC;
+
+	return remap_pfn_range(vma, vma->vm_start,
+			       __phys_to_pfn(cma_file_start(file) + offset),
+			       length, vma->vm_page_prot);
+}
+
+
+
+static int __init cma_dev_init(void)
+{
+	int ret = misc_register(&cma_miscdev);
+	pr_debug("miscdev: register returned: %d\n", ret);
+	return ret;
+}
+module_init(cma_dev_init);
+
+static void __exit cma_dev_exit(void)
+{
+	dev_dbg(cma_dev, "deregistering\n");
+	misc_deregister(&cma_miscdev);
+}
+module_exit(cma_dev_exit);
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 96a8652..36e73fb 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -11,6 +11,36 @@
  * See Documentation/contiguous-memory.txt for details.
  */
 
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+
+#define CMA_MAGIC (('c' << 24) | ('M' << 16) | ('a' << 8) | 0x42)
+
+/**
+ * Information about an area exportable to user space.
+ * @magic: must always be CMA_MAGIC.
+ * @name:  name of the device to allocate as.
+ * @kind:  kind of the memory.
+ * @pad:   reserved.
+ * @size:  size of the chunk to allocate.
+ * @alignment: desired alignment of the chunk (must be power of two or zero).
+ * @start: when ioctl() finishes this stores physical address of the chunk.
+ */
+struct cma_alloc_request {
+	__u32 magic;
+	char  name[17];
+	char  kind[17];
+	__u16 pad;
+	/* __u64 to be compatible across 32 and 64 bit systems. */
+	__u64 size;
+	__u64 alignment;
+	__u64 start;
+};
+
+#define IOCTL_CMA_ALLOC    _IOWR('p', 0, struct cma_alloc_request)
+
+
 /***************************** Kernel lever API *****************************/
 
 #ifdef __KERNEL__
diff --git a/tools/cma/cma-test.c b/tools/cma/cma-test.c
new file mode 100644
index 0000000..567c57b
--- /dev/null
+++ b/tools/cma/cma-test.c
@@ -0,0 +1,373 @@
+/*
+ * cma-test.c -- CMA testing application
+ *
+ * Copyright (C) 2010 Samsung Electronics
+ *                    Author: Michal Nazarewicz <m.nazarewicz@samsung.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/* $(CROSS_COMPILE)gcc -Wall -Wextra -g -o cma-test cma-test.c  */
+
+#include <linux/cma.h>
+
+#include <sys/ioctl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#include <fcntl.h>
+#include <unistd.h>
+
+#include <ctype.h>
+#include <errno.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+
+static void handle_command(char *line);
+
+int main(void)
+{
+	unsigned no = 1;
+	char line[1024];
+	int skip = 0;
+
+	fputs("commands:\n"
+	      " l or list                                list allocated chunks\n"
+	      " a or alloc  <name> <size>[/<alignment>]  allocate chunk\n"
+	      " f or free   [<num>]                      free a chunk\n"
+	      " # ...                                    comment\n"
+	      " <empty line>                             repeat previous\n"
+	      "\n", stderr);
+
+	while (fgets(line, sizeof line, stdin)) {
+		char *nl = strchr(line, '\n');
+		if (nl) {
+			if (skip) {
+				fprintf(stderr, "cma: %d: line too long\n", no);
+				skip = 0;
+			} else {
+				*nl = '\0';
+				handle_command(line);
+			}
+			++no;
+		} else {
+			skip = 1;
+		}
+	}
+
+	if (skip)
+		fprintf(stderr, "cma: %d: no new line at EOF\n", no);
+	return 0;
+}
+
+
+
+static void cmd_list(char *name, char *line);
+static void cmd_alloc(char *name, char *line);
+static void cmd_free(char *name, char *line);
+
+static const struct command {
+	const char name[8];
+	void (*handle)(char *name, char *line);
+} commands[] = {
+	{ "list",  cmd_list  },
+	{ "l",     cmd_list  },
+	{ "alloc", cmd_alloc },
+	{ "a",     cmd_alloc },
+	{ "free",  cmd_free  },
+	{ "f",     cmd_free  },
+	{ "",      NULL      }
+};
+
+
+#define SKIP_SPACE(ch) do while (isspace(*(ch))) ++(ch); while (0)
+
+
+static void handle_command(char *line)
+{
+	static char last_line[1024];
+
+	const struct command *cmd;
+	char *name;
+
+	SKIP_SPACE(line);
+	if (*line == '#')
+		return;
+
+	if (!*line)
+		strcpy(line, last_line);
+	else
+		strcpy(last_line, line);
+
+	name = line;
+	while (*line && !isspace(*line))
+		++line;
+
+	if (*line) {
+		*line = '\0';
+		++line;
+	}
+
+	for (cmd = commands; *(cmd->name); ++cmd)
+		if (!strcmp(name, cmd->name)) {
+			cmd->handle(name, line);
+			return;
+		}
+
+	fprintf(stderr, "%s: unknown command\n", name);
+}
+
+
+
+struct chunk {
+	struct chunk *next, *prev;
+	int fd;
+	unsigned long size;
+	unsigned long start;
+};
+
+static struct chunk root = {
+	.next = &root,
+	.prev = &root,
+};
+
+#define for_each(a) for (a = root.next; a != &root; a = a->next)
+
+static struct chunk *chunk_create(const char *prefix);
+static void chunk_destroy(struct chunk *chunk);
+static void chunk_add(struct chunk *chunk);
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret);
+
+
+static void cmd_list(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	(void)name; (void)line;
+
+	for_each(chunk)
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+}
+
+
+static void cmd_alloc(char *name, char *line)
+{
+	unsigned long size, alignment = 0;
+	struct cma_alloc_request req;
+	char *dev, *kind = NULL;
+	struct chunk *chunk;
+	int ret;
+
+	SKIP_SPACE(line);
+	if (!*line) {
+		fprintf(stderr, "%s: expecting name\n", name);
+		return;
+	}
+
+	for (dev = line; *line && !isspace(*line); ++line)
+		if (*line == '/')
+			kind = line;
+
+	if (!*line) {
+		fprintf(stderr, "%s: expecting size after name\n", name);
+		return;
+	}
+
+	if (kind)
+		*kind++ = '\0';
+	*line++ = '\0';
+
+	if (( kind && (size_t)(kind - dev ) > sizeof req.name)
+	 || (!kind && (size_t)(line - dev ) > sizeof req.name)
+	 || ( kind && (size_t)(line - kind) > sizeof req.kind)) {
+		fprintf(stderr, "%s: name or kind too long\n", name);
+		return;
+	}
+
+
+	if (memparse(line, &line, &size) < 0 || !size) {
+		fprintf(stderr, "%s: invalid size\n", name);
+		return;
+	}
+
+	if (*line == '/')
+		if (memparse(line + 1, &line, &alignment) < 0) {
+			fprintf(stderr, "%s: invalid alignment\n", name);
+			return;
+		}
+
+	SKIP_SPACE(line);
+	if (*line) {
+		fprintf(stderr, "%s: unknown arguments at the end: %s\n",
+			name, line);
+		return;
+	}
+
+
+	chunk = chunk_create(name);
+	if (!chunk)
+		return;
+
+	fprintf(stderr, "%s: allocating %p/%p\n", name,
+		(void *)size, (void *)alignment);
+
+	req.magic     = CMA_MAGIC;
+	req.size      = size;
+	req.alignment = alignment;
+
+	strcpy(req.name, dev);
+	if (kind)
+		strcpy(req.kind, kind);
+	else
+		req.kind[0] = '\0';
+
+
+	ret = ioctl(chunk->fd, IOCTL_CMA_ALLOC, &req);
+	if (ret < 0) {
+		fprintf(stderr, "%s: cma_alloc: %s\n", name, strerror(errno));
+		chunk_destroy(chunk);
+	} else {
+		chunk_add(chunk);
+		chunk->size  = req.size;
+		chunk->start = req.start;
+
+		printf("%3d: %p@%p\n", chunk->fd,
+		       (void *)chunk->size, (void *)chunk->start);
+	}
+}
+
+
+static void cmd_free(char *name, char *line)
+{
+	struct chunk *chunk;
+
+	SKIP_SPACE(line);
+
+	if (*line) {
+		unsigned long num;
+
+		errno = 0;
+		num = strtoul(line, &line, 10);
+
+		if (errno || num > INT_MAX) {
+			fprintf(stderr, "%s: invalid number\n", name);
+			return;
+		}
+
+		SKIP_SPACE(line);
+		if (*line) {
+			fprintf(stderr, "%s: unknown arguments at the end: %s\n",
+				name, line);
+			return;
+		}
+
+		for_each(chunk)
+			if (chunk->fd == (int)num)
+				goto ok;
+		fprintf(stderr, "%s: no chunk %3lu\n", name, num);
+		return;
+
+	} else {
+		chunk = root.prev;
+		if (chunk == &root) {
+			fprintf(stderr, "%s: no chunks\n", name);
+			return;
+		}
+	}
+
+ok:
+	fprintf(stderr, "%s: freeing %p@%p\n", name,
+		(void *)chunk->size, (void *)chunk->start);
+	chunk_destroy(chunk);
+}
+
+
+static struct chunk *chunk_create(const char *prefix)
+{
+	struct chunk *chunk;
+	int fd;
+
+	chunk = malloc(sizeof *chunk);
+	if (!chunk) {
+		fprintf(stderr, "%s: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	fd = open("/dev/cma", O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "%s: /dev/cma: %s\n", prefix, strerror(errno));
+		return NULL;
+	}
+
+	chunk->prev = chunk;
+	chunk->next = chunk;
+	chunk->fd   = fd;
+	return chunk;
+}
+
+static void chunk_destroy(struct chunk *chunk)
+{
+	chunk->prev->next = chunk->next;
+	chunk->next->prev = chunk->prev;
+	close(chunk->fd);
+}
+
+static void chunk_add(struct chunk *chunk)
+{
+	chunk->next = &root;
+	chunk->prev = root.prev;
+	root.prev->next = chunk;
+	root.prev = chunk;
+}
+
+
+
+static int memparse(char *ptr, char **retptr, unsigned long *ret)
+{
+	unsigned long val;
+
+	SKIP_SPACE(ptr);
+
+	errno = 0;
+	val = strtoul(ptr, &ptr, 0);
+	if (errno)
+		return -1;
+
+	switch (*ptr) {
+	case 'G':
+	case 'g':
+		val <<= 10;
+	case 'M':
+	case 'm':
+		val <<= 10;
+	case 'K':
+	case 'k':
+		val <<= 10;
+		++ptr;
+	}
+
+	if (retptr) {
+		SKIP_SPACE(ptr);
+		*retptr = ptr;
+	}
+
+	*ret = val;
+	return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCHv2 4/4] arm: Added CMA to Aquila and Goni
@ 2010-07-26 14:11         ` Michal Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michal Nazarewicz @ 2010-07-26 14:40 UTC (permalink / raw)
  To: linux-arm-kernel

Added the CMA initialisation code to two Samsung platforms.

Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 arch/arm/mach-s5pv210/mach-aquila.c |   13 +++++++++++++
 arch/arm/mach-s5pv210/mach-goni.c   |   13 +++++++++++++
 2 files changed, 26 insertions(+), 0 deletions(-)

diff --git a/arch/arm/mach-s5pv210/mach-aquila.c b/arch/arm/mach-s5pv210/mach-aquila.c
index 0992618..ab156f9 100644
--- a/arch/arm/mach-s5pv210/mach-aquila.c
+++ b/arch/arm/mach-s5pv210/mach-aquila.c
@@ -19,6 +19,7 @@
 #include <linux/gpio_keys.h>
 #include <linux/input.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -454,6 +455,17 @@ static void __init aquila_map_io(void)
 	s3c24xx_init_uarts(aquila_uartcfgs, ARRAY_SIZE(aquila_uartcfgs));
 }
 
+static void __init aquila_reserve(void)
+{
+	static char regions[] __initdata =
+		"-mfc_fw=1M/128K;mfc_b1=32M;mfc_b2=16M@0x40000000";
+	static char map[] __initdata =
+		"s3c-mfc5/f=mfc_fw;s3c-mfc5/a=mfc_b1;s3c-mfc5/b=mfc_b2";
+
+	cma_set_defaults(regions, map, NULL);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init aquila_machine_init(void)
 {
 	/* PMIC */
@@ -478,4 +490,5 @@ MACHINE_START(AQUILA, "Aquila")
 	.map_io		= aquila_map_io,
 	.init_machine	= aquila_machine_init,
 	.timer		= &s3c24xx_timer,
+	.reserve	= aquila_reserve,
 MACHINE_END
diff --git a/arch/arm/mach-s5pv210/mach-goni.c b/arch/arm/mach-s5pv210/mach-goni.c
index 7b18505..2b0a349 100644
--- a/arch/arm/mach-s5pv210/mach-goni.c
+++ b/arch/arm/mach-s5pv210/mach-goni.c
@@ -19,6 +19,7 @@
 #include <linux/gpio_keys.h>
 #include <linux/input.h>
 #include <linux/gpio.h>
+#include <linux/cma.h>
 
 #include <asm/mach/arch.h>
 #include <asm/mach/map.h>
@@ -435,6 +436,17 @@ static void __init goni_map_io(void)
 	s3c24xx_init_uarts(goni_uartcfgs, ARRAY_SIZE(goni_uartcfgs));
 }
 
+static void __init goni_reserve(void)
+{
+	static char regions[] __initdata =
+		"-mfc_fw=1M/128K;mfc_b1=32M;mfc_b2=16M@0x40000000";
+	static char map[] __initdata =
+		"s3c-mfc5/f=mfc_fw;s3c-mfc5/a=mfc_b1;s3c-mfc5/b=mfc_b2";
+
+	cma_set_defaults(regions, map, NULL);
+	cma_early_regions_reserve(NULL);
+}
+
 static void __init goni_machine_init(void)
 {
 	/* PMIC */
@@ -456,4 +468,5 @@ MACHINE_START(GONI, "GONI")
 	.map_io		= goni_map_io,
 	.init_machine	= goni_machine_init,
 	.timer		= &s3c24xx_timer,
+	.reserve	= goni_reserve,
 MACHINE_END
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-26 14:11     ` Michal Nazarewicz
  (?)
@ 2010-07-26 20:28       ` Hans Verkuil
  -1 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-07-26 20:28 UTC (permalink / raw)
  To: Michal Nazarewicz
  Cc: linux-mm, linux-kernel, linux-media, linux-arm-kernel,
	Hiremath Vaibhav, Marek Szyprowski, Pawel Osciak, Mark Brown,
	Daniel Walker, Jonathan Corbet, FUJITA Tomonori, Zach Pfeffer,
	Kyungmin Park

Hi Michal,

Thanks for working on this, we definitely need something along these lines.

On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> The Contiguous Memory Allocator framework is a set of APIs for
> allocating physically contiguous chunks of memory.
> 
> Various chips require contiguous blocks of memory to operate.  Those
> chips include devices such as cameras, hardware video decoders and
> encoders, etc.
> 
> The code is highly modular and customisable to suit the needs of
> various users.  Set of regions reserved for CMA can be configured on
> run-time and it is easy to add custom allocator algorithms if one
> has such need.
> 
> For more details see Documentation/contiguous-memory.txt.
> 
> Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> ---
>  Documentation/00-INDEX                             |    2 +
>  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
>  Documentation/contiguous-memory.txt                |  646 +++++++++++
>  Documentation/kernel-parameters.txt                |    4 +
>  include/linux/cma.h                                |  445 ++++++++
>  mm/Kconfig                                         |   34 +
>  mm/Makefile                                        |    3 +
>  mm/cma-best-fit.c                                  |  407 +++++++
>  mm/cma.c                                           | 1170 ++++++++++++++++++++
>  9 files changed, 2720 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
>  create mode 100644 Documentation/contiguous-memory.txt
>  create mode 100644 include/linux/cma.h
>  create mode 100644 mm/cma-best-fit.c
>  create mode 100644 mm/cma.c
> 
> diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> index 5405f7a..bb50209 100644
> --- a/Documentation/00-INDEX
> +++ b/Documentation/00-INDEX
> @@ -94,6 +94,8 @@ connector/
>  	- docs on the netlink based userspace<->kernel space communication mod.
>  console/
>  	- documentation on Linux console drivers.
> +contiguous-memory.txt
> +	- documentation on physically-contiguous memory allocation framework.
>  cpu-freq/
>  	- info on CPU frequency and voltage scaling.
>  cpu-hotplug.txt
> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> new file mode 100644
> index 0000000..05e2f6a
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> @@ -0,0 +1,9 @@
> +What:		/sys/kernel/mm/contiguous/
> +Date:		July 2008
> +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> +Description:
> +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> +		map.  They are used to configure the Contiguous Memory
> +		Allocator framework.
> +
> +		For details see Documentation/contiguous-memory.txt.
> diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
> new file mode 100644
> index 0000000..6eb1295
> --- /dev/null
> +++ b/Documentation/contiguous-memory.txt
> @@ -0,0 +1,646 @@
> +                                                             -*- org -*-
> +
> +* Contiguous Memory Allocator
> +
> +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> +   setting up a machine-specific configuration for physically-contiguous
> +   memory management. Memory for devices is then allocated according
> +   to that configuration.
> +
> +   The main role of the framework is not to allocate memory, but to
> +   parse and manage memory configurations, as well as to act as an
> +   in-between between device drivers and pluggable allocators. It is
> +   thus not tied to any memory allocation method or strategy.
> +
> +** Why is it needed?
> +
> +    Various devices on embedded systems have no scatter-gather and/or
> +    IO map support and as such require contiguous blocks of memory to
> +    operate.  They include devices such as cameras, hardware video
> +    decoders and encoders, etc.
> +
> +    Such devices often require big memory buffers (a full HD frame is,
> +    for instance, more than 2 megapixels large, i.e. more than 6 MB
> +    of memory), which makes mechanisms such as kmalloc() ineffective.
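To make the figure above concrete: assuming 24-bit RGB at 3 bytes per pixel
(the patch does not name a pixel format, so this is an assumption), the size
of one uncompressed 1920x1080 frame can be worked out as below. frame_bytes()
is a hypothetical helper for illustration, not something from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes needed for one uncompressed frame.
 * bytes_per_pixel is an assumption of the caller (e.g. 3 for 24-bit RGB).
 */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
	assert(bytes_per_pixel > 0);
	return width * height * bytes_per_pixel;
}
```

With width 1920, height 1080 and 3 bytes per pixel this gives
1920 * 1080 = 2,073,600 pixels and 2,073,600 * 3 = 6,220,800 bytes, which is
just over 6 MB in one physically contiguous piece.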
> +
> +    Some embedded devices impose additional requirements on the
> +    buffers, e.g. they can operate only on buffers allocated in
> +    particular location/memory bank (if system has more than one
> +    memory bank) or buffers aligned to a particular memory boundary.
> +
> +    Development of embedded devices has seen a big rise recently
> +    (especially in the V4L area) and many such drivers include their
> +    own memory allocation code. Most of them use bootmem-based methods.
> +    The CMA framework is an attempt to unify contiguous memory allocation
> +    mechanisms and provide a simple API for device drivers, while
> +    staying as customisable and modular as possible.
> +
> +** Design
> +
> +    The main design goal for the CMA was to provide a customisable and
> +    modular framework, which could be configured to suit the needs of
> +    individual systems.  Configuration specifies a list of memory
> +    regions, which then are assigned to devices.  Memory regions can
> +    be shared among many device drivers or assigned exclusively to
> +    one.  This has been achieved in the following ways:

OK, I like the idea of regions, i.e. defining memory areas with specific
properties or uses.

But why should it be possible to define regions through kernel parameters?
Regions are typically fixed for a particular platform and can be set up in the
platform specific code. Actually, one region could be set up by default:
DMA-able memory. That would be very handy in fact for many PCI-based TV
capture drivers.

I think that the only thing that you want to set in the kernel params is the
size of each region.

The same with assigning regions to drivers: why would you want to do that?
The driver should know which regions it can use (with possible fallbacks).
And it can know that, provided regions are set up by the platform code and not
created dynamically. This will simplify things enormously.
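
As a rough sketch of what that could look like: the platform code defines a
fixed table of regions, and each driver passes an ordered list of region names
it can use, falling back to the next one when a region is full. Everything
here (struct region, region_find(), region_alloc(), the "bank0"/"bank1" names
and their sizes) is hypothetical and written as plain userspace C for
illustration only; it is not the API from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, fixed per-platform region table (not the CMA API). */
struct region {
	const char *name;
	size_t size;	/* total bytes reserved for the region */
	size_t used;	/* bytes handed out so far */
};

static struct region platform_regions[] = {
	{ "bank0", 16u << 20, 0 },
	{ "bank1",  8u << 20, 0 },
};

/* Look a region up by name; returns NULL if the platform has no such region. */
static struct region *region_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(platform_regions) / sizeof(platform_regions[0]); i++)
		if (strcmp(platform_regions[i].name, name) == 0)
			return &platform_regions[i];
	return NULL;
}

/*
 * Try each region the driver names, in order, and account `size` bytes
 * to the first one with enough room.  Returns the region used, or NULL
 * if none of the listed regions can satisfy the request.
 */
static struct region *region_alloc(const char *const *prefs, size_t nprefs,
				   size_t size)
{
	size_t i;

	assert(prefs != NULL);
	for (i = 0; i < nprefs; i++) {
		struct region *r = region_find(prefs[i]);

		if (r && r->size - r->used >= size) {
			r->used += size;
			return r;
		}
	}
	return NULL;
}
```

A driver that prefers "bank1" but can live with "bank0" would pass
{ "bank1", "bank0" }; only the region sizes would then remain as something to
tune from the kernel command line.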

> +    1. The core of the CMA does not handle allocation of memory and
> +       management of free space.  Dedicated allocators are used for
> +       that purpose.
> +
> +       This way, if the provided solution does not match demands
> +       imposed on a given system, one can develop a new algorithm and
> +       easily plug it into the CMA framework.
> +
> +       The presented solution includes an implementation of a best-fit
> +       algorithm.

Again, do we really need user-settable per-region allocators? Just provide
one with the option to later choose others through the kernel Kconfig files.

We can always add more complex scenarios later, but for an initial version
I'd keep it simple.
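
For reference, the best-fit idea itself is small. The sketch below is a
generic illustration under stated assumptions, not the code from
mm/cma-best-fit.c: struct chunk, align_up() and best_fit() are hypothetical
names, free space is assumed to be kept in a plain array, and the alignment
is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free-chunk descriptor; not taken from mm/cma-best-fit.c. */
struct chunk {
	uintptr_t start;
	size_t size;
};

/* Round `addr` up to a multiple of `align` (which must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Best fit: of all free chunks that can hold `size` bytes at the requested
 * alignment, pick the smallest one.  Returns its index, or -1 if none fits.
 */
static int best_fit(const struct chunk *free_chunks, size_t n, size_t size,
		    uintptr_t align)
{
	int best = -1;
	size_t i;

	assert(align != 0 && (align & (align - 1)) == 0);
	for (i = 0; i < n; i++) {
		uintptr_t aligned = align_up(free_chunks[i].start, align);
		size_t pad = aligned - free_chunks[i].start;

		/* Skip chunks too small once the alignment padding is lost. */
		if (free_chunks[i].size < pad || free_chunks[i].size - pad < size)
			continue;
		if (best < 0 || free_chunks[i].size < free_chunks[best].size)
			best = (int)i;
	}
	return best;
}
```

Picking the smallest chunk that still fits keeps the large chunks intact for
later large requests, which is the usual argument for best fit over first fit
when buffers are few and big.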

> +
> +    2. CMA allows a run-time configuration of the memory regions it
> +       will use to allocate chunks of memory from.  The set of memory
> +       regions is given on command line so it can be easily changed
> +       without the need for recompiling the kernel.
> +
> +       Each region has its own size, alignment demand, a start
> +       address (physical address where it should be placed) and an
> +       allocator algorithm assigned to the region.
> +
> +       This means that there can be different algorithms running at
> +       the same time, if different devices on the platform have
> +       distinct memory usage characteristics and different algorithms
> +       suit them best.

Seems like overengineering to me. Just ensure that the code can be extended later
to such hypothetical scenarios. They are hypothetical, right?

> +    3. When requesting memory, devices have to introduce themselves.
> +       This way CMA knows who the memory is allocated for.  This
> +       allows for the system architect to specify which memory regions
> +       each device should use.
> +
> +       3a. Devices can also specify a "kind" of memory they want.
> +           This makes it possible to configure the system in such
> +           a way, that a single device may get memory from different
> +           memory regions, depending on the "kind" of memory it
> +           requested.  For example, a video codec driver might want to
> +           allocate some shared buffers from the first memory bank and
> +           the other from the second to get the highest possible
> +           memory throughput.

Not sure I understand this. Isn't this just two regions, one for each memory bank,
and the driver requests some buffers from one region and some from the other?
Not sure how a 'kind of memory' features in this.

> +    4. For greater flexibility and extensibility, the framework allows
> +       device drivers to register private regions of reserved memory
> +       which then may be used only by them.
> +
> +       As a result, even if a driver does not use the rest of the CMA
> +       interface, it can still use CMA allocators and other
> +       mechanisms.

Why would you? Is there an actual driver that will need this?

> +
> +       4a. Early in boot process, device drivers can also request the
> +           CMA framework to reserve a region of memory for them
> +           which then will be used as a private region.
> +
> +           This way, drivers do not need to directly call bootmem,
> +           memblock or similar early allocator but merely register an
> +           early region and the framework will handle the rest
> +           including choosing the right early allocator.

The whole concept of private regions seems unnecessary to me.

<big snip>

It looks to me as if you tried to think of all possible hypothetical situations
and write a framework for that. Of course, you may know more than I do, and some
of these situations actually happen.

The basic design ideas are solid, I think. But you should get rid of all the
fancy features and go back to basics. We can always add those features later
should that become necessary. But removing features is much, much harder.

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco

^ permalink raw reply	[flat|nested] 57+ messages in thread

* RE: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-26 20:28       ` Hans Verkuil
  (?)
@ 2010-07-27  7:41         ` Marek Szyprowski
  -1 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-27  7:41 UTC (permalink / raw)
  To: 'Hans Verkuil', Michal Nazarewicz
  Cc: linux-mm, linux-kernel, linux-media, linux-arm-kernel,
	'Hiremath Vaibhav', Pawel Osciak, 'Mark Brown',
	'Daniel Walker', 'Jonathan Corbet',
	'FUJITA Tomonori', 'Zach Pfeffer',
	'Kyungmin Park'

Hello,

On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:

> Hi Michal,
> 
> Thanks for working on this, we definitely need something along these lines.
> 
> On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > The Contiguous Memory Allocator framework is a set of APIs for
> > allocating physically contiguous chunks of memory.
> >
> > Various chips require contiguous blocks of memory to operate.  Those
> > chips include devices such as cameras, hardware video decoders and
> > encoders, etc.
> >
> > The code is highly modular and customisable to suit the needs of
> > various users.  Set of regions reserved for CMA can be configured on
> > run-time and it is easy to add custom allocator algorithms if one
> > has such need.
> >
> > For more details see Documentation/contiguous-memory.txt.
> >
> > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > ---
> >  Documentation/00-INDEX                             |    2 +
> >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> >  Documentation/contiguous-memory.txt                |  646 +++++++++++
> >  Documentation/kernel-parameters.txt                |    4 +
> >  include/linux/cma.h                                |  445 ++++++++
> >  mm/Kconfig                                         |   34 +
> >  mm/Makefile                                        |    3 +
> >  mm/cma-best-fit.c                                  |  407 +++++++
> >  mm/cma.c                                           | 1170 ++++++++++++++++++++
> >  9 files changed, 2720 insertions(+), 0 deletions(-)
> >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> >  create mode 100644 Documentation/contiguous-memory.txt
> >  create mode 100644 include/linux/cma.h
> >  create mode 100644 mm/cma-best-fit.c
> >  create mode 100644 mm/cma.c
> >
> > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > index 5405f7a..bb50209 100644
> > --- a/Documentation/00-INDEX
> > +++ b/Documentation/00-INDEX
> > @@ -94,6 +94,8 @@ connector/
> >  	- docs on the netlink based userspace<->kernel space communication mod.
> >  console/
> >  	- documentation on Linux console drivers.
> > +contiguous-memory.txt
> > +	- documentation on physically-contiguous memory allocation framework.
> >  cpu-freq/
> >  	- info on CPU frequency and voltage scaling.
> >  cpu-hotplug.txt
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > new file mode 100644
> > index 0000000..05e2f6a
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > @@ -0,0 +1,9 @@
> > +What:		/sys/kernel/mm/contiguous/
> > +Date:		July 2008
> > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > +Description:
> > +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> > +		map.  They are used to configure the Contiguous Memory
> > +		Allocator framework.
> > +
> > +		For details see Documentation/contiguous-memory.txt.
> > diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
> > new file mode 100644
> > index 0000000..6eb1295
> > --- /dev/null
> > +++ b/Documentation/contiguous-memory.txt
> > @@ -0,0 +1,646 @@
> > +                                                             -*- org -*-
> > +
> > +* Contiguous Memory Allocator
> > +
> > +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> > +   setting up a machine-specific configuration for physically-contiguous
> > +   memory management. Memory for devices is then allocated according
> > +   to that configuration.
> > +
> > +   The main role of the framework is not to allocate memory, but to
> > +   parse and manage memory configurations, as well as to act as an
> > +   in-between between device drivers and pluggable allocators. It is
> > +   thus not tied to any memory allocation method or strategy.
> > +
> > +** Why is it needed?
> > +
> > +    Various devices on embedded systems have no scatter-gather and/or
> > +    IO map support and as such require contiguous blocks of memory to
> > +    operate.  They include devices such as cameras, hardware video
> > +    decoders and encoders, etc.
> > +
> > +    Such devices often require big memory buffers (a full HD frame is,
> > +    for instance, more than 2 megapixels large, i.e. more than 6 MB
> > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > +
> > +    Some embedded devices impose additional requirements on the
> > +    buffers, e.g. they can operate only on buffers allocated in
> > +    particular location/memory bank (if system has more than one
> > +    memory bank) or buffers aligned to a particular memory boundary.
> > +
> > +    Development of embedded devices has seen a big rise recently
> > +    (especially in the V4L area) and many such drivers include their
> > +    own memory allocation code. Most of them use bootmem-based methods.
> > +    CMA framework is an attempt to unify contiguous memory allocation
> > +    mechanisms and provide a simple API for device drivers, while
> > +    staying as customisable and modular as possible.
> > +
> > +** Design
> > +
> > +    The main design goal for the CMA was to provide a customisable and
> > +    modular framework, which could be configured to suit the needs of
> > +    individual systems.  Configuration specifies a list of memory
> > +    regions, which then are assigned to devices.  Memory regions can
> > +    be shared among many device drivers or assigned exclusively to
> > +    one.  This has been achieved in the following ways:
> 
> OK, I like the idea of regions, i.e. defining memory areas with specific
> properties or uses.
> 
> But why should it be possible to define regions through kernel parameters?
> Regions are typically fixed for a particular platform and can be setup in the
> platform specific code. Actually, one region could be setup by default:
> DMA-able memory. That would be very handy in fact for many PCI-based TV
> capture drivers.

IMHO this is just a desktop point of view. In the embedded world things are
a bit different. Most SoCs have some kind of common system memory, and
usually all built-in peripherals are able to DMA to any part of it (there is
no DMA-specific hardware zone).

> I think that the only thing that you want to set in the kernel params is
> the size of each region.

Keeping it as a kernel parameter is very handy for development. But I agree
that we might make it dependent on some Kconfig entry. This way the platform
setup code would provide a default region description as an array of region
structures, and we would get rid of the parsing code in release versions.
 
> The same with assigning regions to drivers: why would you want to do that?
> The driver should know which regions it can use (with possible fallbacks).

I'm sorry, but this is again a little 'desktop-centric' point of view. On a
desktop it is perfectly acceptable to have a separate memory region for each
device. In the embedded world memory is a precious resource. Of course we can
go the 'separate memory region for each device' way, but we observed that at
least some memory can be recovered if we decide to share memory regions
between some of the devices.

Assigning regions to the drivers is a way to describe how memory can be
shared. This is something that is independent from the actual drivers. 
Device drivers cannot and mustn't have such knowledge. 

> And it can know that provided regions are setup by the platform code and
> not created dynamically. This will simplify things enormously.
> 
> > +    1. The core of the CMA does not handle allocation of memory and
> > +       management of free space.  Dedicated allocators are used for
> > +       that purpose.
> > +
> > +       This way, if the provided solution does not match demands
> > +       imposed on a given system, one can develop a new algorithm and
> > +       easily plug it into the CMA framework.
> > +
> > +       The presented solution includes an implementation of a best-fit
> > +       algorithm.
> 
> Again, do we really need user-settable per-region allocators? Just provide
> one with the option to later choose others through the kernel Kconfig files.

From our experience, yes. Different allocators cope better or worse with
different memory usage scenarios, which results in higher or lower memory
fragmentation. System use cases are something that the kernel and drivers are
definitely not aware of, so only user space can tune this parameter to get the
best possible system behavior.

> We can always add more complex scenarios later, but for an initial version
> I'd keep it simple.
> 
> > +
> > +    2. CMA allows a run-time configuration of the memory regions it
> > +       will use to allocate chunks of memory from.  The set of memory
> > +       regions is given on command line so it can be easily changed
> > +       without the need for recompiling the kernel.
> > +
> > +       Each region has its own size, alignment demand, a start
> > +       address (physical address where it should be placed) and an
> > +       allocator algorithm assigned to the region.
> > +
> > +       This means that there can be different algorithms running at
> > +       the same time, if different devices on the platform have
> > +       distinct memory usage characteristics and different algorithms
> > +       match those the best way.
> 
> Seems overengineering to me. Just ensure that the code can be extended
> later to such hypothetical scenarios. They are hypothetical, right?

Not really. Being able to change the memory configuration without
recompiling the kernel is very handy when one is tuning the configuration
for a specific use case.

> > +    3. When requesting memory, devices have to introduce themselves.
> > +       This way CMA knows who the memory is allocated for.  This
> > +       allows for the system architect to specify which memory regions
> > +       each device should use.
> > +
> > +       3a. Devices can also specify a "kind" of memory they want.
> > +           This makes it possible to configure the system in such
> > +           a way, that a single device may get memory from different
> > +           memory regions, depending on the "kind" of memory it
> > +           requested.  For example, a video codec driver might want to
> > +           allocate some shared buffers from the first memory bank and
> > +           the other from the second to get the highest possible
> > +           memory throughput.
> 
> Not sure I understand this. Isn't this just two regions, one for each
> memory bank, and the driver requests some buffers from one region and
> some from the other?

Right.

> Not sure how a 'kind of memory' features in this.

This 'kind' is just a cookie or a label used by the driver to distinguish
requests for the two memory banks. This functionality is essential for our
hardware (for the hardware video codec alone we have 3 'kinds' of memory:
memory bank A, memory bank B and a special region for the firmware).

> > +    4. For greater flexibility and extensibility, the framework allows
> > +       device drivers to register private regions of reserved memory
> > +       which then may be used only by them.
> > +
> > +       As an effect, if a driver would not use the rest of the CMA
> > +       interface, it can still use CMA allocators and other
> > +       mechanisms.
> 
> Why would you? Is there an actual driver that will need this?

This feature has been added after posting v1 of this RFC/patch. Jonathan
Corbet suggested in <http://article.gmane.org/gmane.linux.kernel.mm/50689>
that the viafb driver might register its own private memory and use CMA just
as an allocator. IMHO this is a good idea: this way we might remove a bunch
of custom allocators from the drivers (yes, there are such all over the
kernel).

> > +       4a. Early in boot process, device drivers can also request the
> > +           CMA framework to reserve a region of memory for them
> > +           which then will be used as a private region.
> > +
> > +           This way, drivers do not need to directly call bootmem,
> > +           memblock or similar early allocator but merely register an
> > +           early region and the framework will handle the rest
> > +           including choosing the right early allocator.
> 
> The whole concept of private regions seems unnecessary to me.
> 
> <big snip>
> 
> It looks to me as if you tried to think of all possible hypothetical
> situations and write a framework for that. Of course, you may know more
> than I do, and some of these situations actually happen.

Not exactly. We tried to design a solution that would cover all requirements
for OUR (quite specific) embedded hardware. However, we didn't want to tie it
only to our platform, so we generalized most of our requirements so that they
can be reused for other systems.

> The basic design ideas are solid, I think. But you should get rid of all
> the fancy features and go back to basics. We can always add those features
> later should that become necessary. But removing features is much, much
> harder.

Well, please keep in mind that we cannot remove features that are essential
for our solution. We know that a simple framework has some advantages
(the most important one being that it is easy to understand), but making it
too simple would render it useless from our point of view, since it would
not provide the functionality required by our drivers and hardware.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center




^ permalink raw reply	[flat|nested] 57+ messages in thread

* RE: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-27  7:41         ` Marek Szyprowski
  0 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-27  7:41 UTC (permalink / raw)
  To: 'Hans Verkuil', Michal Nazarewicz
  Cc: linux-mm, linux-kernel, linux-media, linux-arm-kernel,
	'Hiremath Vaibhav', Pawel Osciak, 'Mark Brown',
	'Daniel Walker', 'Jonathan Corbet',
	'FUJITA Tomonori', 'Zach Pfeffer',
	'Kyungmin Park'

Hello,

On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:

> Hi Michal,
> 
> Thanks for working on this, we definitely need something along these lines.
> 
> On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > The Contiguous Memory Allocator framework is a set of APIs for
> > allocating physically contiguous chunks of memory.
> >
> > Various chips require contiguous blocks of memory to operate.  Those
> > chips include devices such as cameras, hardware video decoders and
> > encoders, etc.
> >
> > The code is highly modular and customisable to suit the needs of
> > various users.  Set of regions reserved for CMA can be configured on
> > run-time and it is easy to add custom allocator algorithms if one
> > has such need.
> >
> > For more details see Documentation/contiguous-memory.txt.
> >
> > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > ---
> >  Documentation/00-INDEX                             |    2 +
> >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> >  Documentation/contiguous-memory.txt                |  646 +++++++++++
> >  Documentation/kernel-parameters.txt                |    4 +
> >  include/linux/cma.h                                |  445 ++++++++
> >  mm/Kconfig                                         |   34 +
> >  mm/Makefile                                        |    3 +
> >  mm/cma-best-fit.c                                  |  407 +++++++
> >  mm/cma.c                                           | 1170 ++++++++++++++++++++
> >  9 files changed, 2720 insertions(+), 0 deletions(-)
> >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> >  create mode 100644 Documentation/contiguous-memory.txt
> >  create mode 100644 include/linux/cma.h
> >  create mode 100644 mm/cma-best-fit.c
> >  create mode 100644 mm/cma.c
> >
> > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > index 5405f7a..bb50209 100644
> > --- a/Documentation/00-INDEX
> > +++ b/Documentation/00-INDEX
> > @@ -94,6 +94,8 @@ connector/
> >  	- docs on the netlink based userspace<->kernel space communication mod.
> >  console/
> >  	- documentation on Linux console drivers.
> > +contiguous-memory.txt
> > +	- documentation on physically-contiguous memory allocation framework.
> >  cpu-freq/
> >  	- info on CPU frequency and voltage scaling.
> >  cpu-hotplug.txt
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > new file mode 100644
> > index 0000000..05e2f6a
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > @@ -0,0 +1,9 @@
> > +What:		/sys/kernel/mm/contiguous/
> > +Date:		July 2008
> > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > +Description:
> > +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> > +		map.  They are used to configure the Contiguous Memory
> > +		Allocator framework.
> > +
> > +		For details see Documentation/contiguous-memory.txt.
> > diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
> > new file mode 100644
> > index 0000000..6eb1295
> > --- /dev/null
> > +++ b/Documentation/contiguous-memory.txt
> > @@ -0,0 +1,646 @@
> > +                                                             -*- org -*-
> > +
> > +* Contiguous Memory Allocator
> > +
> > +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> > +   setting up a machine-specific configuration for physically-contiguous
> > +   memory management. Memory for devices is then allocated according
> > +   to that configuration.
> > +
> > +   The main role of the framework is not to allocate memory, but to
> > +   parse and manage memory configurations, as well as to act as an
> > +   in-between between device drivers and pluggable allocators. It is
> > +   thus not tied to any memory allocation method or strategy.
> > +
> > +** Why is it needed?
> > +
> > +    Various devices on embedded systems have no scatter-gather and/or
> > +    IO map support and as such require contiguous blocks of memory to
> > +    operate.  They include devices such as cameras, hardware video
> > +    decoders and encoders, etc.
> > +
> > +    Such devices often require big memory buffers (a full HD frame is,
> > +    for instance, more than 2 megapixels large, i.e. more than 6 MB
> > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > +
> > +    Some embedded devices impose additional requirements on the
> > +    buffers, e.g. they can operate only on buffers allocated in
> > +    particular location/memory bank (if system has more than one
> > +    memory bank) or buffers aligned to a particular memory boundary.
> > +
> > +    Development of embedded devices has seen a big rise recently
> > +    (especially in the V4L area) and many such drivers include their
> > +    own memory allocation code. Most of them use bootmem-based methods.
> > +    CMA framework is an attempt to unify contiguous memory allocation
> > +    mechanisms and provide a simple API for device drivers, while
> > +    staying as customisable and modular as possible.
> > +
> > +** Design
> > +
> > +    The main design goal for the CMA was to provide a customisable and
> > +    modular framework, which could be configured to suit the needs of
> > +    individual systems.  Configuration specifies a list of memory
> > +    regions, which then are assigned to devices.  Memory regions can
> > +    be shared among many device drivers or assigned exclusively to
> > +    one.  This has been achieved in the following ways:
> 
> OK, I like the idea of regions, i.e. defining memory areas with specific
> properties or uses.
> 
> But why should it be possible to define regions through kernel parameters?
> Regions are typically fixed for a particular platform and can be setup in the
> platform specific code. Actually, one region could be setup by default:
> DMA-able memory. That would be very handy in fact for many PCI-based TV
> capture drivers.

IMHO this is just a desktop point of view. In the embedded world things are
a bit different. Most SoCs have some kind of common system memory, and
usually all built-in peripherals are able to DMA to any part of it (there is
no DMA-specific hardware zone).

> I think that the only thing that you want to set in the kernel params is
> the size of each region.

Keeping it as a kernel parameter is very handy for development. But I agree
that we might make it dependent on some Kconfig entry. This way the platform
setup code would provide a default region description as an array of region
structures, and we would get rid of the parsing code in release versions.
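
To illustrate the idea, here is a minimal sketch of a region table supplied
by platform setup code instead of a parsed string. The structure, its field
names, the region names and the "bf" allocator label are all hypothetical;
they only mirror the per-region properties listed in the documentation
(size, alignment, start address, allocator) and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical region descriptor; names are illustrative, not the CMA API. */
struct region_desc {
	const char *name;        /* label used to refer to the region */
	unsigned long start;     /* requested physical start, 0 = anywhere */
	unsigned long size;      /* size in bytes */
	unsigned long alignment; /* required alignment, a power of two */
	const char *allocator;   /* allocator label, e.g. "bf" (assumed) */
};

/* Example board table: two memory banks plus a small firmware region. */
static const struct region_desc board_regions[] = {
	{ "bank_a", 0x40000000UL, 32UL << 20, 1UL << 20, "bf" },
	{ "bank_b", 0x50000000UL, 16UL << 20, 1UL << 20, "bf" },
	{ "fw",     0,             1UL << 20, 1UL << 17, "bf" },
};

#define BOARD_REGION_COUNT (sizeof(board_regions) / sizeof(board_regions[0]))

/* Return 1 if the descriptor is self-consistent, 0 otherwise. */
static int region_desc_valid(const struct region_desc *r)
{
	if (!r->name || !r->allocator || r->size == 0)
		return 0;
	/* The alignment must be a non-zero power of two. */
	if (r->alignment == 0 || (r->alignment & (r->alignment - 1)))
		return 0;
	/* A fixed start address must honour the alignment. */
	if (r->start & (r->alignment - 1))
		return 0;
	return 1;
}

/* Total amount of memory the table asks to reserve. */
static unsigned long regions_total_size(const struct region_desc *r, size_t n)
{
	unsigned long total = 0;
	size_t i;

	assert(r != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += r[i].size;
	return total;
}
```

With a table like this the string parser could be compiled out, while a
development build could still override the table from the command line.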
 
> The same with assigning regions to drivers: why would you want to do that?
> The driver should know which regions it can use (with possible fallbacks).

I'm sorry, but this is again a little 'desktop-centric' point of view. On a
desktop it is perfectly acceptable to have a separate memory region for each
device. In the embedded world memory is a precious resource. Of course we can
go the 'separate memory region for each device' way, but we observed that at
least some memory can be recovered if we decide to share memory regions
between some of the devices.

Assigning regions to the drivers is a way to describe how memory can be
shared. This is something that is independent from the actual drivers. 
Device drivers cannot and mustn't have such knowledge. 

> And it can know that provided regions are setup by the platform code and
> not created dynamically. This will simplify things enormously.
> 
> > +    1. The core of the CMA does not handle allocation of memory and
> > +       management of free space.  Dedicated allocators are used for
> > +       that purpose.
> > +
> > +       This way, if the provided solution does not match demands
> > +       imposed on a given system, one can develop a new algorithm and
> > +       easily plug it into the CMA framework.
> > +
> > +       The presented solution includes an implementation of a best-fit
> > +       algorithm.
> 
> Again, do we really need user-settable per-region allocators? Just provide
> one with the option to later choose others through the kernel Kconfig files.

From our experience, yes. Different allocators cope better or worse with
different memory usage scenarios, which results in higher or lower memory
fragmentation. System use cases are something that the kernel and drivers are
definitely not aware of, so only user space can tune this parameter to get the
best possible system behavior.
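
As a point of reference for what "allocator" means here, the following is a
minimal sketch of the best-fit selection step: among the free chunks of a
region, pick the smallest one that can still hold the request at the required
alignment. It is not the code from mm/cma-best-fit.c; the structure and
function names are hypothetical, and a plain array stands in for whatever
data structure a real allocator would use.

```c
#include <stddef.h>

/* Hypothetical free-range record inside one region. */
struct free_chunk {
	unsigned long start; /* first byte of the free range */
	unsigned long size;  /* length of the free range in bytes */
};

/* Round addr up to the next multiple of align (align is a power of two). */
static unsigned long align_up(unsigned long addr, unsigned long align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Return the index of the smallest free chunk that can hold `size` bytes
 * starting at an address aligned to `align`, or -1 if no chunk fits.
 */
static int best_fit(const struct free_chunk *chunks, size_t n,
		    unsigned long size, unsigned long align)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long aligned = align_up(chunks[i].start, align);
		unsigned long pad = aligned - chunks[i].start;

		/* Skip chunks too small once alignment padding is counted. */
		if (chunks[i].size < pad || chunks[i].size - pad < size)
			continue;
		if (best < 0 || chunks[i].size < chunks[best].size)
			best = (int)i;
	}
	return best;
}
```

Best-fit tends to preserve large free chunks for large requests; a different
policy (first-fit, for instance) would trade that for speed, which is the kind
of difference that makes the choice depend on the use case.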

> We can always add more complex scenarios later, but for an initial version
> I'd keep it simple.
> 
> > +
> > +    2. CMA allows a run-time configuration of the memory regions it
> > +       will use to allocate chunks of memory from.  The set of memory
> > +       regions is given on command line so it can be easily changed
> > +       without the need for recompiling the kernel.
> > +
> > +       Each region has its own size, alignment demand, a start
> > +       address (physical address where it should be placed) and an
> > +       allocator algorithm assigned to the region.
> > +
> > +       This means that there can be different algorithms running at
> > +       the same time, if different devices on the platform have
> > +       distinct memory usage characteristics and different algorithms
> > +       match those the best way.
> 
> Seems overengineering to me. Just ensure that the code can be extended
> later to such hypothetical scenarios. They are hypothetical, right?

Not really. Being able to change the memory configuration without
recompiling the kernel is very handy when one is tuning the configuration
for a specific use case.

> > +    3. When requesting memory, devices have to introduce themselves.
> > +       This way CMA knows who the memory is allocated for.  This
> > +       allows for the system architect to specify which memory regions
> > +       each device should use.
> > +
> > +       3a. Devices can also specify a "kind" of memory they want.
> > +           This makes it possible to configure the system in such
> > +           a way, that a single device may get memory from different
> > +           memory regions, depending on the "kind" of memory it
> > +           requested.  For example, a video codec driver might want to
> > +           allocate some shared buffers from the first memory bank and
> > +           the other from the second to get the highest possible
> > +           memory throughput.
> 
> Not sure I understand this. Isn't this just two regions, one for each
> memory bank, and the driver requests some buffers from one region and
> some from the other?

Right.

> Not sure how a 'kind of memory' features in this.

This 'kind' is just a cookie or a label used by the driver to distinguish
requests for the two memory banks. This functionality is essential for our
hardware (for the hardware video codec alone we have 3 'kinds' of memory:
memory bank A, memory bank B and a special region for the firmware).
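
A minimal sketch of how such a label could be resolved: the driver passes
only its device name and a 'kind', and a table owned by the platform decides
which region that pair maps to. The table layout, the "*" wildcard and all
names below are assumptions made for illustration; they do not reproduce the
map syntax defined by the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping rule from (device, kind) to a region name. */
struct map_entry {
	const char *device; /* device name, or "*" for any device */
	const char *kind;   /* kind label, or "*" for any kind */
	const char *region; /* region to allocate from */
};

/* A pattern matches if it is the wildcard or equals the value. */
static int matches(const char *pattern, const char *value)
{
	return strcmp(pattern, "*") == 0 || strcmp(pattern, value) == 0;
}

/* First matching rule wins; returns NULL when nothing matches. */
static const char *lookup_region(const struct map_entry *map, size_t n,
				 const char *device, const char *kind)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (matches(map[i].device, device) &&
		    matches(map[i].kind, kind))
			return map[i].region;
	return NULL;
}

/* Example: a codec with three kinds, everything else shares one region. */
static const struct map_entry example_map[] = {
	{ "video-codec", "a",  "bank_a" },
	{ "video-codec", "b",  "bank_b" },
	{ "video-codec", "fw", "fw" },
	{ "*",           "*",  "common" },
};
```

The driver never names a region itself, so the same driver binary can be
pointed at shared or dedicated regions by changing only the table.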

> > +    4. For greater flexibility and extensibility, the framework allows
> > +       device drivers to register private regions of reserved memory
> > +       which then may be used only by them.
> > +
> > +       As an effect, if a driver would not use the rest of the CMA
> > +       interface, it can still use CMA allocators and other
> > +       mechanisms.
> 
> Why would you? Is there an actual driver that will need this?

This feature has been added after posting v1 of this RFC/patch. Jonathan
Corbet suggested in <http://article.gmane.org/gmane.linux.kernel.mm/50689>
that the viafb driver might register its own private memory and use CMA just
as an allocator. IMHO this is a good idea: this way we might remove a bunch
of custom allocators from the drivers (yes, there are such all over the
kernel).

> > +       4a. Early in boot process, device drivers can also request the
> > +           CMA framework to reserve a region of memory for them
> > +           which then will be used as a private region.
> > +
> > +           This way, drivers do not need to directly call bootmem,
> > +           memblock or similar early allocator but merely register an
> > +           early region and the framework will handle the rest
> > +           including choosing the right early allocator.
> 
> The whole concept of private regions seems unnecessary to me.
> 
> <big snip>
> 
> It looks to me as if you tried to think of all possible hypothetical
> situations and write a framework for that. Of course, you may know more
> than I do, and some of these situations actually happen.

Not exactly. We tried to design a solution that would cover all requirements
for OUR (quite specific) embedded hardware. However, we didn't want to tie it
only to our platform, so we generalized most of our requirements so that they
can be reused for other systems.

> The basic design ideas are solid, I think. But you should get rid of all
> the fancy features and go back to basics. We can always add those features
> later should that become necessary. But removing features is much, much
> harder.

Well, please keep in mind that we cannot remove features that are essential
for our solution. We know that a simple framework has some advantages
(the most important one being that it is easy to understand), but making it
too simple would render it useless from our point of view, since it would
not provide the functionality required by our drivers and hardware.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center




^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-27  7:41         ` Marek Szyprowski
  0 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-27  7:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:

> Hi Michal,
> 
> Thanks for working on this, we definitely need something along these lines.
> 
> On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > The Contiguous Memory Allocator framework is a set of APIs for
> > allocating physically contiguous chunks of memory.
> >
> > Various chips require contiguous blocks of memory to operate.  Those
> > chips include devices such as cameras, hardware video decoders and
> > encoders, etc.
> >
> > The code is highly modular and customisable to suit the needs of
> > various users.  Set of regions reserved for CMA can be configured on
> > run-time and it is easy to add custom allocator algorithms if one
> > has such need.
> >
> > For more details see Documentation/contiguous-memory.txt.
> >
> > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > ---
> >  Documentation/00-INDEX                             |    2 +
> >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> >  Documentation/contiguous-memory.txt                |  646 +++++++++++
> >  Documentation/kernel-parameters.txt                |    4 +
> >  include/linux/cma.h                                |  445 ++++++++
> >  mm/Kconfig                                         |   34 +
> >  mm/Makefile                                        |    3 +
> >  mm/cma-best-fit.c                                  |  407 +++++++
> >  mm/cma.c                                           | 1170
> ++++++++++++++++++++
> >  9 files changed, 2720 insertions(+), 0 deletions(-)
> >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> >  create mode 100644 Documentation/contiguous-memory.txt
> >  create mode 100644 include/linux/cma.h
> >  create mode 100644 mm/cma-best-fit.c
> >  create mode 100644 mm/cma.c
> >
> > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > index 5405f7a..bb50209 100644
> > --- a/Documentation/00-INDEX
> > +++ b/Documentation/00-INDEX
> > @@ -94,6 +94,8 @@ connector/
> >  	- docs on the netlink based userspace<->kernel space communication
> mod.
> >  console/
> >  	- documentation on Linux console drivers.
> > +contiguous-memory.txt
> > +	- documentation on physically-contiguous memory allocation framework.
> >  cpu-freq/
> >  	- info on CPU frequency and voltage scaling.
> >  cpu-hotplug.txt
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > new file mode 100644
> > index 0000000..05e2f6a
> > --- /dev/null
> > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > @@ -0,0 +1,9 @@
> > +What:		/sys/kernel/mm/contiguous/
> > +Date:		July 2008
> > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > +Description:
> > +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> > +		map.  They are used to configure the Contiguous Memory
> > +		Allocator framework.
> > +
> > +		For details see Documentation/contiguous-memory.txt.
> > diff --git a/Documentation/contiguous-memory.txt
> b/Documentation/contiguous-memory.txt
> > new file mode 100644
> > index 0000000..6eb1295
> > --- /dev/null
> > +++ b/Documentation/contiguous-memory.txt
> > @@ -0,0 +1,646 @@
> > +                                                             -*- org -*-
> > +
> > +* Contiguous Memory Allocator
> > +
> > +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> > +   setting up a machine-specific configuration for physically-contiguous
> > +   memory management. Memory for devices is then allocated according
> > +   to that configuration.
> > +
> > +   The main role of the framework is not to allocate memory, but to
> > +   parse and manage memory configurations, as well as to act as an
> > +   in-between between device drivers and pluggable allocators. It is
> > +   thus not tied to any memory allocation method or strategy.
> > +
> > +** Why is it needed?
> > +
> > +    Various devices on embedded systems have no scatter-getter and/or
> > +    IO map support and as such require contiguous blocks of memory to
> > +    operate.  They include devices such as cameras, hardware video
> > +    decoders and encoders, etc.
> > +
> > +    Such devices often require big memory buffers (a full HD frame is,
> > +    for instance, more then 2 mega pixels large, i.e. more than 6 MB
> > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > +
> > +    Some embedded devices impose additional requirements on the
> > +    buffers, e.g. they can operate only on buffers allocated in
> > +    particular location/memory bank (if system has more than one
> > +    memory bank) or buffers aligned to a particular memory boundary.
> > +
> > +    Development of embedded devices have seen a big rise recently
> > +    (especially in the V4L area) and many such drivers include their
> > +    own memory allocation code. Most of them use bootmem-based methods.
> > +    CMA framework is an attempt to unify contiguous memory allocation
> > +    mechanisms and provide a simple API for device drivers, while
> > +    staying as customisable and modular as possible.
> > +
> > +** Design
> > +
> > +    The main design goal for the CMA was to provide a customisable and
> > +    modular framework, which could be configured to suit the needs of
> > +    individual systems.  Configuration specifies a list of memory
> > +    regions, which then are assigned to devices.  Memory regions can
> > +    be shared among many device drivers or assigned exclusively to
> > +    one.  This has been achieved in the following ways:
> 
> OK, I like the idea of regions, i.e. defining memory areas with specific
> properties or uses.
> 
> But why should it be possible to define regions through kernel parameters?
> Regions are typically fixed for a particular platform and can be setup in
> the
> platform specific code. Actually, one region could be setup by default:
> DMA-able memory. That would be very handy in fact for many PCI-based TV
> capture drivers.

IMHO this is a just desktop-point-of-view. In embedded world things are
a bit different. Most SoCs have a some kind of common system memory and
usually all build-in peripherals are able to DMA to any part of it (there is
no DMA specific hardware zone).

> I think that the only thing that you want to set in the kernel params is
> the size of each region.

Keeping it as a kernel parameter is very handy for development. But I agree
that we might make it dependent on some Kconfig entry. This way a platform
setup code would provide default region description just as an array of the
region structures and we will get rid of the parsing code in the release
versions.
 
> The same with assigning regions to drivers: why would you want to do that?
> The driver should know which regions it can use (with possible fallbacks).

I'm sorry, but this is again a somewhat desktop-centric point of view. On a
desktop it is perfectly acceptable to have a separate memory region for each
device. In the embedded world memory is a precious resource. Of course we can
go the 'separate memory region for each device' way, but we observed that at
least some memory can be recovered if we decide to share memory regions
between some of the devices.

Assigning regions to drivers is a way to describe how memory can be
shared. This is independent of the actual drivers: device drivers
cannot and must not have such knowledge.

> And it can know that provided regions are setup by the platform code and
> not created dynamically. This will simplify things enormously.
> 
> > +    1. The core of the CMA does not handle allocation of memory and
> > +       management of free space.  Dedicated allocators are used for
> > +       that purpose.
> > +
> > +       This way, if the provided solution does not match demands
> > +       imposed on a given system, one can develop a new algorithm and
> > +       easily plug it into the CMA framework.
> > +
> > +       The presented solution includes an implementation of a best-fit
> > +       algorithm.
> 
> Again, do we really need user-settable per-region allocators? Just provide
> one with the option to later choose others through the kernel Kconfig files.
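
For reference, the best-fit strategy named in the quoted documentation can be
sketched as follows. This is only an illustration of the general technique in
userspace C, not the allocator from the patch: the chunk structure and the
example_* names are hypothetical, and alignment handling is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical free-chunk descriptor, for illustration only. */
struct example_chunk {
	unsigned long start;
	unsigned long size;
};

/*
 * Best fit: among all free chunks that can hold `size` bytes, return the
 * index of the smallest one, or -1 if none is large enough.
 */
static int example_best_fit(const struct example_chunk *chunks, size_t n,
			    unsigned long size)
{
	int best = -1;
	size_t i;

	assert(chunks != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (chunks[i].size < size)
			continue;	/* too small for this request */
		if (best < 0 || chunks[i].size < chunks[best].size)
			best = (int)i;	/* tighter fit than the current candidate */
	}
	return best;
}
```

Picking the tightest chunk keeps large free chunks intact for large requests,
at the price of scanning every free chunk on each allocation.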

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-26 14:11     ` Michal Nazarewicz
  (?)
@ 2010-07-27 12:08       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 57+ messages in thread
From: Russell King - ARM Linux @ 2010-07-27 12:08 UTC (permalink / raw)
  To: Michal Nazarewicz
  Cc: linux-mm, Daniel Walker, Jonathan Corbet, Pawel Osciak,
	Mark Brown, linux-kernel, Hiremath Vaibhav, FUJITA Tomonori,
	Kyungmin Park, Zach Pfeffer, linux-media, linux-arm-kernel,
	Marek Szyprowski

On Mon, Jul 26, 2010 at 04:40:30PM +0200, Michal Nazarewicz wrote:
> +** Why is it needed?
> +
> +    Various devices on embedded systems have no scatter-gather and/or
> +    IO map support and as such require contiguous blocks of memory to
> +    operate.  They include devices such as cameras, hardware video
> +    decoders and encoders, etc.

Yes, this is becoming quite a big problem - and many ARM SoCs suffer
from the existing memory allocators being extremely inadequate for
their use.

One of the areas I've been working on is sorting out the DMA coherent
allocator so we don't violate the architecture requirements for ARMv6
and ARMv7 CPUs (which basically prohibit multiple mappings of memory
with different attributes.)

One of the ideas that I've thought about for this is to reserve an
amount of contiguous memory at boot time to fill the entire DMA coherent
mapping, marking the memory in the main kernel memory map as 'no access',
and allocate directly from the DMA coherent region.

However, discussing this with people who have the problem you're trying
to solve indicates that they do not want to set aside an amount of
memory as they perceive this to be a waste of resources.

This concern also applies to 'cma'.

> +/*
> + * Don't call it directly, use cma_alloc(), cma_alloc_from() or
> + * cma_alloc_from_region().
> + */
> +dma_addr_t __must_check
> +__cma_alloc(const struct device *dev, const char *kind,
> +	    size_t size, dma_addr_t alignment);

Does this really always return DMA-able memory (memory which can be
DMA'd to/from without DMA-mapping etc?)

As it returns a dma_addr_t, it's returning a cookie for the memory which
will be suitable for writing directly to the device 'dev' doing the DMA.
(NB: DMA addresses may not be the same as physical addresses, especially
if the device is on a downstream bus.  We have ARM platforms which have
different bus offsets.)
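
To make the bus-offset point concrete, here is a small userspace model. The
types, the structure and the addresses are made up for illustration; in the
kernel this translation is done by the DMA mapping code, not by drivers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t example_phys_addr_t;	/* stand-in for phys_addr_t */
typedef uint64_t example_dma_addr_t;	/* stand-in for dma_addr_t */

/* Hypothetical per-device description of a fixed bus offset. */
struct example_device {
	example_phys_addr_t ram_base;	/* start of RAM as seen by the CPU */
	example_dma_addr_t bus_base;	/* start of the same RAM as seen by the device */
};

/* Translate a CPU physical address into the address the device must use. */
static example_dma_addr_t
example_phys_to_dma(const struct example_device *dev, example_phys_addr_t phys)
{
	assert(phys >= dev->ram_base);
	return phys - dev->ram_base + dev->bus_base;
}
```

For a device with ram_base == bus_base the two addresses coincide; for one
that sees RAM starting at bus address 0 they differ by the full RAM base.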

How does one obtain the CPU address of this memory in order for the CPU
to access it?

> +static inline dma_addr_t __must_check
> +cma_alloc(const struct device *dev, const char *kind,
> +	  size_t size, dma_addr_t alignment)
> +{
> +	return dev ? -EINVAL : __cma_alloc(dev, kind, size, alignment);

So I can't use this to allocate memory for anything but a NULL device?
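
The effect of that condition can be reproduced outside the kernel. In the
sketch below the types and the backend are stubs invented for the purpose,
and the "inverted" variant is only a guess at what was presumably meant.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t example_dma_addr_t;	/* stand-in for dma_addr_t */
struct example_device { int id; };	/* stand-in for struct device */

/* Stub backend: pretend every request succeeds at a fixed address. */
static example_dma_addr_t
example_backend_alloc(const struct example_device *dev, size_t size)
{
	(void)dev;
	assert(size > 0);
	return 0x40000000ULL;
}

/* As quoted: a non-NULL dev is rejected, so only NULL reaches the backend. */
static example_dma_addr_t
example_alloc_as_posted(const struct example_device *dev, size_t size)
{
	return dev ? (example_dma_addr_t)(-EINVAL)
		   : example_backend_alloc(dev, size);
}

/* Presumably intended: reject NULL and forward everything else. */
static example_dma_addr_t
example_alloc_inverted(const struct example_device *dev, size_t size)
{
	return !dev ? (example_dma_addr_t)(-EINVAL)
		    : example_backend_alloc(dev, size);
}
```

Note also that the error travels through an unsigned address type, so a
caller has to test for it explicitly (in the kernel, with something like
IS_ERR_VALUE()) rather than compare against zero.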

> +static inline int
> +cma_info(struct cma_info *info, const struct device *dev, const char *kind)
> +{
> +	return dev ? -EINVAL : __cma_info(info, dev, kind);

This won't return information for anything but a NULL device?

^ permalink raw reply	[flat|nested] 57+ messages in thread

* RE: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27 12:08       ` Russell King - ARM Linux
  (?)
@ 2010-07-27 12:45         ` Marek Szyprowski
  -1 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-27 12:45 UTC (permalink / raw)
  To: 'Russell King - ARM Linux', Michal Nazarewicz
  Cc: linux-mm, 'Daniel Walker', 'Jonathan Corbet',
	Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', 'Kyungmin Park',
	'Zach Pfeffer',
	linux-media, linux-arm-kernel

Hello,

On Tuesday, July 27, 2010 2:09 PM Russell King - ARM Linux wrote:

> On Mon, Jul 26, 2010 at 04:40:30PM +0200, Michal Nazarewicz wrote:
> > +** Why is it needed?
> > +
> > +    Various devices on embedded systems have no scatter-gather and/or
> > +    IO map support and as such require contiguous blocks of memory to
> > +    operate.  They include devices such as cameras, hardware video
> > +    decoders and encoders, etc.
> 
> Yes, this is becoming quite a big problem - and many ARM SoCs suffer
> from the existing memory allocators being extremely inadequate for
> their use.
> 
> One of the areas I've been working on is sorting out the DMA coherent
> allocator so we don't violate the architecture requirements for ARMv6
> and ARMv7 CPUs (which basically prohibits multiple mappings of memory
> with different attributes.)
> 
> One of the ideas that I've thought about for this is to reserve an
> amount of contiguous memory at boot time to fill the entire DMA coherent
> mapping, marking the memory in the main kernel memory map as 'no access',
> and allocate directly from the DMA coherent region.
> 
> However, discussing this with people who have the problem you're trying
> to solve indicates that they do not want to set aside an amount of
> memory as they perceive this to be a waste of resources.

Assuming your board has only 128MB of physical memory (quite a common case
for embedded boards), leaving 16MB (12.5% of RAM) unused just for the DMA
coherent area is a huge waste, IMHO.

> This concern also applies to 'cma'.

Yes, we know. We plan to recover some of that 'wasted' memory by providing
a way to allocate some kind of virtual swap device on it. This is just an
idea; no related work has been started yet.

> 
> > +/*
> > + * Don't call it directly, use cma_alloc(), cma_alloc_from() or
> > + * cma_alloc_from_region().
> > + */
> > +dma_addr_t __must_check
> > +__cma_alloc(const struct device *dev, const char *kind,
> > +	    size_t size, dma_addr_t alignment);
> 
> Does this really always return DMA-able memory (memory which can be
> DMA'd to/from without DMA-mapping etc?)
> 
> As it returns a dma_addr_t, it's returning a cookie for the memory which
> will be suitable for writing directly to the device 'dev' doing the DMA.
> (NB: DMA addresses may not be the same as physical addresses, especially
> if the device is on a downstream bus.  We have ARM platforms which have
> different bus offsets.)
> 
> How does one obtain the CPU address of this memory in order for the CPU
> to access it?

Right, we did not cover that case. In the CMA approach we tried to separate
memory allocation from mapping the memory into user/kernel space. Mapping
a buffer is a much more complicated process that cannot be handled in a
generic way, so we decided to leave this to the device drivers. Usually
video processing devices don't need an in-kernel mapping for such
buffers at all.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27 12:45         ` Marek Szyprowski
  (?)
@ 2010-07-27 12:58           ` Jonathan Corbet
  -1 siblings, 0 replies; 57+ messages in thread
From: Jonathan Corbet @ 2010-07-27 12:58 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: 'Russell King - ARM Linux',
	Michal Nazarewicz, linux-mm, 'Daniel Walker',
	Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', 'Kyungmin Park',
	'Zach Pfeffer',
	linux-media, linux-arm-kernel

On Tue, 27 Jul 2010 14:45:58 +0200
Marek Szyprowski <m.szyprowski@samsung.com> wrote:

> > How does one obtain the CPU address of this memory in order for the CPU
> > to access it?  
> 
> Right, we did not cover such case. In CMA approach we tried to separate
> memory allocation from the memory mapping into user/kernel space. Mapping
> a buffer is much more complicated process that cannot be handled in a
> generic way, so we decided to leave this for the device drivers. Usually
> video processing devices also don't need in-kernel mapping for such
> buffers at all.

Still...that *is* why I suggested an interface which would return both
the DMA address and a kernel-space virtual address, just like the DMA
API does...  Either that, or just return the void * kernel address and
let drivers do the DMA mapping themselves.  Returning only the
dma_addr_t address will make the interface difficult to use in many
situations.
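
A rough userspace model of such a two-address interface is sketched below.
It mirrors the shape of dma_alloc_coherent() (CPU pointer as the return
value, device address through an output parameter), but the static pool, the
bus base and every example_* name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t example_dma_addr_t;	/* stand-in for dma_addr_t */

#define EXAMPLE_POOL_SIZE	4096
#define EXAMPLE_POOL_BUS_BASE	0x20000000ULL	/* made-up device-side base */

static unsigned char example_pool[EXAMPLE_POOL_SIZE];
static size_t example_pool_used;

/*
 * Hand out the next `size` bytes of the pool. Returns the CPU-visible
 * pointer and stores the matching device-visible address in *handle,
 * or returns NULL (leaving *handle untouched) when the pool is exhausted.
 */
static void *example_alloc_both(size_t size, example_dma_addr_t *handle)
{
	void *cpu_addr;

	assert(handle != NULL);
	if (size == 0 || size > EXAMPLE_POOL_SIZE - example_pool_used)
		return NULL;
	cpu_addr = &example_pool[example_pool_used];
	*handle = EXAMPLE_POOL_BUS_BASE + example_pool_used;
	example_pool_used += size;
	return cpu_addr;
}
```

A driver that never touches the buffer from the CPU could simply ignore the
returned pointer, while one that does gets both addresses from a single call.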

jon

^ permalink raw reply	[flat|nested] 57+ messages in thread

* RE: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27 12:58           ` Jonathan Corbet
  (?)
@ 2010-07-27 13:46             ` Marek Szyprowski
  -1 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-27 13:46 UTC (permalink / raw)
  To: 'Jonathan Corbet'
  Cc: 'Russell King - ARM Linux',
	Michal Nazarewicz, linux-mm, 'Daniel Walker',
	Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', 'Kyungmin Park',
	'Zach Pfeffer',
	linux-media, linux-arm-kernel

Hello,

On Tuesday, July 27, 2010 2:59 PM Jonathan Corbet wrote:

> On Tue, 27 Jul 2010 14:45:58 +0200
> Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> > > How does one obtain the CPU address of this memory in order for the CPU
> > > to access it?
> >
> > Right, we did not cover such case. In CMA approach we tried to separate
> > memory allocation from the memory mapping into user/kernel space. Mapping
> > a buffer is much more complicated process that cannot be handled in a
> > generic way, so we decided to leave this for the device drivers. Usually
> > video processing devices also don't need in-kernel mapping for such
> > buffers at all.
> 
> Still...that *is* why I suggested an interface which would return both
> the DMA address and a kernel-space virtual address, just like the DMA
> API does...  Either that, or just return the void * kernel address and
> let drivers do the DMA mapping themselves.  Returning only the
> dma_addr_t address will make the interface difficult to use in many
> situations.

As I said, drivers usually don't need an in-kernel mapping for video buffers.
Is there really a need to create such a mapping?

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center



^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27 12:58           ` Jonathan Corbet
  (?)
@ 2010-07-27 14:21             ` FUJITA Tomonori
  -1 siblings, 0 replies; 57+ messages in thread
From: FUJITA Tomonori @ 2010-07-27 14:21 UTC (permalink / raw)
  To: corbet
  Cc: m.szyprowski, linux, m.nazarewicz, linux-mm, dwalker, p.osciak,
	broonie, linux-kernel, hvaibhav, fujita.tomonori, kyungmin.park,
	zpfeffer, linux-media, linux-arm-kernel

On Tue, 27 Jul 2010 06:58:42 -0600
Jonathan Corbet <corbet@lwn.net> wrote:

> On Tue, 27 Jul 2010 14:45:58 +0200
> Marek Szyprowski <m.szyprowski@samsung.com> wrote:
> 
> > > How does one obtain the CPU address of this memory in order for the CPU
> > > to access it?  
> > 
> > Right, we did not cover such case. In CMA approach we tried to separate
> > memory allocation from the memory mapping into user/kernel space. Mapping
> > a buffer is much more complicated process that cannot be handled in a
> > generic way, so we decided to leave this for the device drivers. Usually
> > video processing devices also don't need in-kernel mapping for such
> > buffers at all.
> 
> Still...that *is* why I suggested an interface which would return both
> the DMA address and a kernel-space virtual address, just like the DMA
> API does...  Either that, or just return the void * kernel address and

The DMA API for coherent memory (dma_alloc_coherent) returns both a
DMA address and a kernel-space virtual address because it does both
allocation and mapping.

However, the other DMA API calls (dma_map_*) return only a DMA address
because they do only mapping.

I think that if we need a new API for coherent memory, we could
unify it with the DMA API for coherent memory.

IMO, it's cleaner to have two separate APIs for allocation and
mapping (except for coherent memory). The drivers have been working
that way.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27  7:41         ` Marek Szyprowski
  (?)
@ 2010-07-27 16:27           ` Hans Verkuil
  -1 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-07-27 16:27 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Michal Nazarewicz, linux-mm, linux-kernel, linux-media,
	linux-arm-kernel, 'Hiremath Vaibhav',
	Pawel Osciak, 'Mark Brown', 'Daniel Walker',
	'Jonathan Corbet', 'FUJITA Tomonori',
	'Zach Pfeffer', 'Kyungmin Park'

On Tuesday 27 July 2010 09:41:40 Marek Szyprowski wrote:
> Hello,
> 
> On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:
> 
> > Hi Michal,
> > 
> > Thanks for working on this, we definitely need something along these lines.
> > 
> > On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > > The Contiguous Memory Allocator framework is a set of APIs for
> > > allocating physically contiguous chunks of memory.
> > >
> > > Various chips require contiguous blocks of memory to operate.  Those
> > > chips include devices such as cameras, hardware video decoders and
> > > encoders, etc.
> > >
> > > The code is highly modular and customisable to suit the needs of
> > > various users.  Set of regions reserved for CMA can be configured on
> > > run-time and it is easy to add custom allocator algorithms if one
> > > has such need.
> > >
> > > For more details see Documentation/contiguous-memory.txt.
> > >
> > > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > > ---
> > >  Documentation/00-INDEX                             |    2 +
> > >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> > >  Documentation/contiguous-memory.txt                |  646 +++++++++++
> > >  Documentation/kernel-parameters.txt                |    4 +
> > >  include/linux/cma.h                                |  445 ++++++++
> > >  mm/Kconfig                                         |   34 +
> > >  mm/Makefile                                        |    3 +
> > >  mm/cma-best-fit.c                                  |  407 +++++++
> > >  mm/cma.c                                           | 1170
> > ++++++++++++++++++++
> > >  9 files changed, 2720 insertions(+), 0 deletions(-)
> > >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > >  create mode 100644 Documentation/contiguous-memory.txt
> > >  create mode 100644 include/linux/cma.h
> > >  create mode 100644 mm/cma-best-fit.c
> > >  create mode 100644 mm/cma.c
> > >
> > > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > > index 5405f7a..bb50209 100644
> > > --- a/Documentation/00-INDEX
> > > +++ b/Documentation/00-INDEX
> > > @@ -94,6 +94,8 @@ connector/
> > >  	- docs on the netlink based userspace<->kernel space communication
> > mod.
> > >  console/
> > >  	- documentation on Linux console drivers.
> > > +contiguous-memory.txt
> > > +	- documentation on physically-contiguous memory allocation framework.
> > >  cpu-freq/
> > >  	- info on CPU frequency and voltage scaling.
> > >  cpu-hotplug.txt
> > > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > new file mode 100644
> > > index 0000000..05e2f6a
> > > --- /dev/null
> > > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > @@ -0,0 +1,9 @@
> > > +What:		/sys/kernel/mm/contiguous/
> > > +Date:		July 2008
> > > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > +Description:
> > > +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> > > +		map.  They are used to configure the Contiguous Memory
> > > +		Allocator framework.
> > > +
> > > +		For details see Documentation/contiguous-memory.txt.
> > > diff --git a/Documentation/contiguous-memory.txt
> > b/Documentation/contiguous-memory.txt
> > > new file mode 100644
> > > index 0000000..6eb1295
> > > --- /dev/null
> > > +++ b/Documentation/contiguous-memory.txt
> > > @@ -0,0 +1,646 @@
> > > +                                                             -*- org -*-
> > > +
> > > +* Contiguous Memory Allocator
> > > +
> > > +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> > > +   setting up a machine-specific configuration for physically-contiguous
> > > +   memory management. Memory for devices is then allocated according
> > > +   to that configuration.
> > > +
> > > +   The main role of the framework is not to allocate memory, but to
> > > +   parse and manage memory configurations, as well as to act as an
> > > +   in-between between device drivers and pluggable allocators. It is
> > > +   thus not tied to any memory allocation method or strategy.
> > > +
> > > +** Why is it needed?
> > > +
> > > +    Various devices on embedded systems have no scatter-gather and/or
> > > +    IO map support and as such require contiguous blocks of memory to
> > > +    operate.  They include devices such as cameras, hardware video
> > > +    decoders and encoders, etc.
> > > +
> > > +    Such devices often require big memory buffers (a full HD frame is,
> > > +    for instance, more than 2 megapixels large, i.e. more than 6 MB
> > > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > > +
> > > +    Some embedded devices impose additional requirements on the
> > > +    buffers, e.g. they can operate only on buffers allocated in
> > > +    particular location/memory bank (if system has more than one
> > > +    memory bank) or buffers aligned to a particular memory boundary.
> > > +
> > > +    Development of embedded devices has seen a big rise recently
> > > +    (especially in the V4L area) and many such drivers include their
> > > +    own memory allocation code. Most of them use bootmem-based methods.
> > > +    CMA framework is an attempt to unify contiguous memory allocation
> > > +    mechanisms and provide a simple API for device drivers, while
> > > +    staying as customisable and modular as possible.
> > > +
> > > +** Design
> > > +
> > > +    The main design goal for the CMA was to provide a customisable and
> > > +    modular framework, which could be configured to suit the needs of
> > > +    individual systems.  Configuration specifies a list of memory
> > > +    regions, which then are assigned to devices.  Memory regions can
> > > +    be shared among many device drivers or assigned exclusively to
> > > +    one.  This has been achieved in the following ways:
> > 
> > OK, I like the idea of regions, i.e. defining memory areas with specific
> > properties or uses.
> > 
> > But why should it be possible to define regions through kernel parameters?
> > Regions are typically fixed for a particular platform and can be setup in
> > the
> > platform specific code. Actually, one region could be setup by default:
> > DMA-able memory. That would be very handy in fact for many PCI-based TV
> > capture drivers.
> 
> IMHO this is just a desktop point of view. In the embedded world things are
> a bit different. Most SoCs have some kind of common system memory and
> usually all built-in peripherals are able to DMA to any part of it (there is
> no DMA-specific hardware zone).

There are still desktops in the world, you know :-) It's nice if cma can also
be used there. Having a region called 'dma' or something like that would be
useful for drivers. And yes, for most embedded systems that would be the
equivalent of common memory.
 
> > I think that the only thing that you want to set in the kernel params is
> > the size of each region.
> 
> Keeping it as a kernel parameter is very handy for development. But I agree
> that we might make it dependent on some Kconfig entry. This way a platform
> setup code would provide default region description just as an array of the
> region structures and we will get rid of the parsing code in the release
> versions.

I think that would help, yes.
  
> > The same with assigning regions to drivers: why would you want to do that?
> > The driver should know which regions it can use (with possible fallbacks).
> 
> I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> desktop it is perfectly acceptable to have a separate memory region for each
> device. In embedded world memory is a precious resource. Of course we can go
> the 'separate memory region for each device' way, but we observed that at
> least some memory can be recovered if we decide to share memory regions for
> some of the devices.
> 
> Assigning regions to the drivers is a way to describe how memory can be
> shared. This is something that is independent from the actual drivers. 
> Device drivers cannot and mustn't have such knowledge. 

You misunderstand me. I wasn't promoting separate memory regions for each device.
Quite the opposite: initially I would make regions shared by default. I actually
do not see why device drivers can't have the knowledge to allocate their memory
from particular regions. We are talking about embedded systems where these types
of memory allocations tend to be well defined. E.g. the video capture driver needs
to allocate X video buffers at boot time. For most devices that will be the dma
region, in the case of samsung devices that will be regions for particular memory
banks.

The only time reserving regions for specific devices might be necessary is
when multiple drivers sharing one region would fragment it. But if
drivers just allocate the memory at boot, then shared regions work fine.

> 
> > And it can know that provided regions are setup by the platform code and
> > not created dynamically. This will simplify things enormously.
> > 
> > > +    1. The core of the CMA does not handle allocation of memory and
> > > +       management of free space.  Dedicated allocators are used for
> > > +       that purpose.
> > > +
> > > +       This way, if the provided solution does not match demands
> > > +       imposed on a given system, one can develop a new algorithm and
> > > +       easily plug it into the CMA framework.
> > > +
> > > +       The presented solution includes an implementation of a best-fit
> > > +       algorithm.
> > 
> > Again, do we really need user-settable per-region allocators? Just provide
> > one with the option to later choose others through the kernel Kconfig files.
> 
> From our experience, yes. Different allocators can cope with different memory
> usage scenarios better or worse. This results in higher or lower memory
> fragmentation. System use cases are something that the kernel or drivers are
> definitely not aware of, so only user space can tune this parameter to get the
> best possible system behavior.

Can you describe some of those usage scenarios? The cases I am familiar with
are the standard: 'need to allocate X buffers of Y MB memory each on boot' and
these buffers are only freed when the module is unloaded. While you can get
fragmentation in that case if you creatively load and unload modules, this is
not the normal usage in a deployed system.

We already have several 'normal' memory allocators (slab, slub, slob, sleb,
whatever. I've lost track there.) and I'm not very enthusiastic about the
idea of adding another bunch of them. Especially not in a first release.

Keep it simple. Just fix the core problem first, even if you are not yet able
to squeeze the last byte of memory from your system. Later you can always add
features as needed and optimize it further.
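
For reference, the best-fit policy under discussion comes down to picking
the smallest free block that still satisfies the request. A minimal
user-space sketch in plain C follows; it is not the code in
mm/cma-best-fit.c, and struct hole and best_fit() are names made up for
illustration only:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one free block inside a region. */
struct hole {
	size_t start;	/* offset of the free block within the region */
	size_t size;	/* length of the free block in bytes */
};

/*
 * Return the index of the smallest hole that can hold `size` bytes,
 * or -1 if no hole is large enough.  Ties go to the earliest hole.
 */
int best_fit(const struct hole *holes, size_t n, size_t size)
{
	int best = -1;
	size_t i;

	assert(holes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (holes[i].size < size)
			continue;	/* too small, skip */
		if (best < 0 || holes[i].size < holes[best].size)
			best = (int)i;
	}
	return best;
}
```

With a boot-time-only allocation pattern the list of holes never grows, so
the choice of policy matters little; it starts to matter once buffers are
freed and reallocated at run time.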
 
> > We can always add more complex scenarios later, but for an initial version
> > I'd keep it simple.
> > 
> > > +
> > > +    2. CMA allows a run-time configuration of the memory regions it
> > > +       will use to allocate chunks of memory from.  The set of memory
> > > +       regions is given on command line so it can be easily changed
> > > +       without the need for recompiling the kernel.
> > > +
> > > +       Each region has its own size, alignment demand, a start
> > > +       address (physical address where it should be placed) and an
> > > +       allocator algorithm assigned to the region.
> > > +
> > > +       This means that there can be different algorithms running at
> > > +       the same time, if different devices on the platform have
> > > +       distinct memory usage characteristics and different algorithms
> > > +       match those the best way.
> > 
> > Seems overengineering to me. Just ensure that the code can be extended
> > later to such hypothetical scenarios. They are hypothetical, right?
> 
> Not really. Having the possibility to reconfigure memory configuration
> without kernel recompilation is very handy when one is tuning the
> configuration for the specific use case.
> 
> > > +    3. When requesting memory, devices have to introduce themselves.
> > > +       This way CMA knows who the memory is allocated for.  This
> > > +       allows for the system architect to specify which memory regions
> > > +       each device should use.
> > > +
> > > +       3a. Devices can also specify a "kind" of memory they want.
> > > +           This makes it possible to configure the system in such
> > > +           a way, that a single device may get memory from different
> > > +           memory regions, depending on the "kind" of memory it
> > > +           requested.  For example, a video codec driver might want to
> > > +           allocate some shared buffers from the first memory bank and
> > > +           the other from the second to get the highest possible
> > > +           memory throughput.
> > 
> > Not sure I understand this. Isn't this just two regions, one for each
> > memory bank,
> > and the driver requests some buffers from one region and some from the
> > other?
> 
> Right.
> 
> > Not sure how a 'kind of memory' features in this.
> 
> This 'kind' is just a cookie or a label used by the driver to distinguish
> requests for both memory banks. This functionality is essential for our
> hardware (just for hardware video codec we have 3 'kinds' of memory: memory
> bank A, memory bank B and special region for the firmware).

Let me see if I understand: driver D needs memory from bank A, so it specifies
e.g. label "a" when requesting memory. And the kernel parameters are then supposed
to define a region for memory bank A (say region "bank-a") and tell the cma that
requests from driver D for memory of kind "a" should go to region "bank-a".

Of course, the driver can also just request memory from the platform-defined
region "bank-a" directly rather than having to rely on userspace provided
parameters.
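
To make the lookup described above concrete, here is a rough user-space
sketch in plain C. It only models the idea of routing a (driver, kind) pair
to a named region; struct map_rule, the rule table and lookup_region() are
hypothetical and do not come from the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: requests from `dev` for `kind` go to `region`. */
struct map_rule {
	const char *dev;	/* driver name, or "*" for any driver */
	const char *kind;	/* kind label, or "*" for any kind */
	const char *region;	/* name of the region to allocate from */
};

/* Example table; in the proposal this would come from the attributes. */
static const struct map_rule rules[] = {
	{ "D", "a", "bank-a" },
	{ "D", "b", "bank-b" },
	{ "*", "*", "dma"    },	/* fallback for everything else */
};

static int matches(const char *pattern, const char *value)
{
	return strcmp(pattern, "*") == 0 || strcmp(pattern, value) == 0;
}

/* Return the region name for (dev, kind); the first matching rule wins. */
const char *lookup_region(const char *dev, const char *kind)
{
	size_t i;

	assert(dev != NULL && kind != NULL);

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++)
		if (matches(rules[i].dev, dev) && matches(rules[i].kind, kind))
			return rules[i].region;
	return NULL;
}
```

The alternative mentioned above skips the table entirely: the driver passes
"bank-a" itself and no rule matching is needed.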
 
> > > +    4. For greater flexibility and extensibility, the framework allows
> > > +       device drivers to register private regions of reserved memory
> > > +       which then may be used only by them.
> > > +
> > > +       As an effect, if a driver would not use the rest of the CMA
> > > +       interface, it can still use CMA allocators and other
> > > +       mechanisms.
> > 
> > Why would you? Is there an actual driver that will need this?
> 
> This feature has been added after posting v1 of this rfc/patch. Jonathan 
> Corbet suggested in <http://article.gmane.org/gmane.linux.kernel.mm/50689> 
> that viafb driver might register its own private memory and use cma just
> as an allocator. IMHO this is a good idea, this way we might remove a bunch
> of custom allocators from the drivers (yes, there are such all over the
> kernel).

It goes back to whether we want per-device (or per-driver) regions or just
have global regions. What I have seen in practice is that these drivers just
need X amount of contiguous memory on boot. Having just a single region (as
it will be for most systems) to carve the buffers from is just as efficient
if not more than creating separate regions for each driver. Only if you
start freeing and reallocating memory later on will you get into trouble.

But if you do that, then you are trying to duplicate the behavior of the
normal allocators in my opinion. I really don't think we want to go there.

> 
> > > +       4a. Early in boot process, device drivers can also request the
> > > +           CMA framework to reserve a region of memory for them
> > > +           which then will be used as a private region.
> > > +
> > > +           This way, drivers do not need to directly call bootmem,
> > > +           memblock or similar early allocator but merely register an
> > > +           early region and the framework will handle the rest
> > > +           including choosing the right early allocator.
> > 
> > The whole concept of private regions seems unnecessary to me.
> > 
> > <big snip>
> > 
> > It looks to me as if you tried to think of all possible hypothetical
> > situations
> > and write a framework for that. Of course, you may know more than I do, and
> > some of these situations actually happen.
> 
> Not exactly. We tried to design a solution that would cover all requirements
> for OUR (quite specific) embedded hardware. However we didn't want to tie it
> only to our platform. We just generalized most of our requirements so they can
> be reused for other systems.
> 
> > The basic design ideas are solid, I think. But you should get rid of all
> > the fancy features and go back to basics. We can always add those features
> > later should that become necessary. But removing features is much, much
> harder.
> 
> Well, please keep in mind that we cannot remove features that are essential
> for our solution. We know that a simple framework has some advantages
> (well, the most important one is the fact that it is easy to understand),
> but making it too simple would render it useless from our point of view
> (if it would not provide functionality required by our drivers and hardware).

Why not give a list of the requirements that your hardware has? The only
requirement unique to your hardware that I am aware of is the need for buffers
in specific memory banks (and apparently a special memory area for firmware).

So besides a 'dma' region (which is probably sufficient for most systems)
you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.

In the kernel parameters the user can specify the sizes for these regions
that are reserved at boot time and drivers will use cma to get their memory
from the relevant region using a simple allocator.
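
A rough user-space model of that simpler scheme, in plain C, could look as
follows. It is only a sketch under stated assumptions: the region sizes are
invented, and struct region, region_find() and region_alloc() are
hypothetical names, not part of the patch. The "simple allocator" here is a
bump allocator that never frees, which matches the allocate-once-at-boot
usage:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ALLOC_FAIL ((size_t)-1)

struct region {
	const char *name;
	size_t size;	/* bytes reserved at boot (made-up values below) */
	size_t used;	/* bytes handed out so far */
};

static struct region regions[] = {
	{ "dma",   16u << 20, 0 },
	{ "banka",  8u << 20, 0 },
	{ "bankb",  8u << 20, 0 },
	{ "fw",     1u << 20, 0 },
};

/* Look a region up by name; returns NULL if it does not exist. */
struct region *region_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(regions) / sizeof(regions[0]); i++)
		if (strcmp(regions[i].name, name) == 0)
			return &regions[i];
	return NULL;
}

/*
 * Carve `size` bytes out of region `r`, aligned to `align` (a power of
 * two).  Returns the offset within the region, or ALLOC_FAIL.
 */
size_t region_alloc(struct region *r, size_t size, size_t align)
{
	size_t start;

	if (r == NULL || size == 0 || align == 0 || (align & (align - 1)))
		return ALLOC_FAIL;

	assert(r->used <= r->size);

	start = (r->used + align - 1) & ~(align - 1);
	if (start > r->size || size > r->size - start)
		return ALLOC_FAIL;

	r->used = start + size;
	return start;
}
```

A driver would then do something like region_alloc(region_find("banka"),
size, align) once at probe time and keep the buffer until it is unloaded.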

If there are other requirements in your hardware (or other hardware) that
should be taken into account for an initial version of the cma, then I'm
interested in hearing about it.

One note: since the first version of the patch wasn't posted to linux-media
I didn't follow the replies to that. If there are things in that thread that
I should read, then just mail me some links.

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-27 16:27           ` Hans Verkuil
  0 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-07-27 16:27 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Michal Nazarewicz, linux-mm, linux-kernel, linux-media,
	linux-arm-kernel, 'Hiremath Vaibhav',
	Pawel Osciak, 'Mark Brown', 'Daniel Walker',
	'Jonathan Corbet', 'FUJITA Tomonori',
	'Zach Pfeffer', 'Kyungmin Park'

On Tuesday 27 July 2010 09:41:40 Marek Szyprowski wrote:
> Hello,
> 
> On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:
> 
> > Hi Michal,
> > 
> > Thanks for working on this, we definitely need something along these lines.
> > 
> > On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > > The Contiguous Memory Allocator framework is a set of APIs for
> > > allocating physically contiguous chunks of memory.
> > >
> > > Various chips require contiguous blocks of memory to operate.  Those
> > > chips include devices such as cameras, hardware video decoders and
> > > encoders, etc.
> > >
> > > The code is highly modular and customisable to suit the needs of
> > > various users.  Set of regions reserved for CMA can be configured on
> > > run-time and it is easy to add custom allocator algorithms if one
> > > has such need.
> > >
> > > For more details see Documentation/contiguous-memory.txt.
> > >
> > > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > > ---
> > >  Documentation/00-INDEX                             |    2 +
> > >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> > >  Documentation/contiguous-memory.txt                |  646 +++++++++++
> > >  Documentation/kernel-parameters.txt                |    4 +
> > >  include/linux/cma.h                                |  445 ++++++++
> > >  mm/Kconfig                                         |   34 +
> > >  mm/Makefile                                        |    3 +
> > >  mm/cma-best-fit.c                                  |  407 +++++++
> > >  mm/cma.c                                           | 1170
> > ++++++++++++++++++++
> > >  9 files changed, 2720 insertions(+), 0 deletions(-)
> > >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > >  create mode 100644 Documentation/contiguous-memory.txt
> > >  create mode 100644 include/linux/cma.h
> > >  create mode 100644 mm/cma-best-fit.c
> > >  create mode 100644 mm/cma.c
> > >
> > > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > > index 5405f7a..bb50209 100644
> > > --- a/Documentation/00-INDEX
> > > +++ b/Documentation/00-INDEX
> > > @@ -94,6 +94,8 @@ connector/
> > >  	- docs on the netlink based userspace<->kernel space communication
> > mod.
> > >  console/
> > >  	- documentation on Linux console drivers.
> > > +contiguous-memory.txt
> > > +	- documentation on physically-contiguous memory allocation framework.
> > >  cpu-freq/
> > >  	- info on CPU frequency and voltage scaling.
> > >  cpu-hotplug.txt
> > > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > new file mode 100644
> > > index 0000000..05e2f6a
> > > --- /dev/null
> > > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > @@ -0,0 +1,9 @@
> > > +What:		/sys/kernel/mm/contiguous/
> > > +Date:		July 2008
> > > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > +Description:
> > > +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> > > +		map.  They are used to configure the Contiguous Memory
> > > +		Allocator framework.
> > > +
> > > +		For details see Documentation/contiguous-memory.txt.
> > > diff --git a/Documentation/contiguous-memory.txt
> > b/Documentation/contiguous-memory.txt
> > > new file mode 100644
> > > index 0000000..6eb1295
> > > --- /dev/null
> > > +++ b/Documentation/contiguous-memory.txt
> > > @@ -0,0 +1,646 @@
> > > +                                                             -*- org -*-
> > > +
> > > +* Contiguous Memory Allocator
> > > +
> > > +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> > > +   setting up a machine-specific configuration for physically-contiguous
> > > +   memory management. Memory for devices is then allocated according
> > > +   to that configuration.
> > > +
> > > +   The main role of the framework is not to allocate memory, but to
> > > +   parse and manage memory configurations, as well as to act as an
> > > +   in-between between device drivers and pluggable allocators. It is
> > > +   thus not tied to any memory allocation method or strategy.
> > > +
> > > +** Why is it needed?
> > > +
> > > +    Various devices on embedded systems have no scatter-gather and/or
> > > +    IO map support and as such require contiguous blocks of memory to
> > > +    operate.  They include devices such as cameras, hardware video
> > > +    decoders and encoders, etc.
> > > +
> > > +    Such devices often require big memory buffers (a full HD frame is,
> > > +    for instance, more than 2 megapixels large, i.e. more than 6 MB
> > > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > > +
> > > +    Some embedded devices impose additional requirements on the
> > > +    buffers, e.g. they can operate only on buffers allocated in
> > > +    particular location/memory bank (if system has more than one
> > > +    memory bank) or buffers aligned to a particular memory boundary.
> > > +
> > > +    Development of embedded devices has seen a big rise recently
> > > +    (especially in the V4L area) and many such drivers include their
> > > +    own memory allocation code. Most of them use bootmem-based methods.
> > > +    CMA framework is an attempt to unify contiguous memory allocation
> > > +    mechanisms and provide a simple API for device drivers, while
> > > +    staying as customisable and modular as possible.
> > > +
> > > +** Design
> > > +
> > > +    The main design goal for the CMA was to provide a customisable and
> > > +    modular framework, which could be configured to suit the needs of
> > > +    individual systems.  Configuration specifies a list of memory
> > > +    regions, which then are assigned to devices.  Memory regions can
> > > +    be shared among many device drivers or assigned exclusively to
> > > +    one.  This has been achieved in the following ways:
> > 
> > OK, I like the idea of regions, i.e. defining memory areas with specific
> > properties or uses.
> > 
> > But why should it be possible to define regions through kernel parameters?
> > Regions are typically fixed for a particular platform and can be setup in
> > the
> > platform specific code. Actually, one region could be setup by default:
> > DMA-able memory. That would be very handy in fact for many PCI-based TV
> > capture drivers.
> 
> IMHO this is just a desktop point of view. In the embedded world things are
> a bit different. Most SoCs have some kind of common system memory and
> usually all built-in peripherals are able to DMA to any part of it (there is
> no DMA-specific hardware zone).

There are still desktops in the world, you know :-) It's nice if cma can also
be used there. Having a region called 'dma' or something like that would be
useful for drivers. And yes, for most embedded systems that would be the
equivalent of common memory.
 
> > I think that the only thing that you want to set in the kernel params is
> > the size of each region.
> 
> Keeping it as a kernel parameter is very handy for development. But I agree
> that we might make it dependent on some Kconfig entry. This way a platform
> setup code would provide default region description just as an array of the
> region structures and we will get rid of the parsing code in the release
> versions.

I think that would help, yes.
  
> > The same with assigning regions to drivers: why would you want to do that?
> > The driver should know which regions it can use (with possible fallbacks).
> 
> I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> desktop it is perfectly acceptable to have a separate memory region for each
> device. In embedded world memory is a precious resource. Of course we can go
> the 'separate memory region for each device' way, but we observed that at
> least some memory can be recovered if we decide to share memory regions for
> some of the devices.
> 
> Assigning regions to the drivers is a way to describe how memory can be
> shared. This is something that is independent from the actual drivers. 
> Device drivers cannot and mustn't have such knowledge. 

You misunderstand me. I wasn't promoting separate memory regions for each device.
Quite the opposite: initially I would make regions shared by default. I actually
do not see why device drivers can't have the knowledge to allocate their memory
from particular regions. We are talking about embedded systems where these types
of memory allocations tend to be well defined. E.g. the video capture driver needs
to allocate X video buffers at boot time. For most devices that will be the dma
region, in the case of samsung devices that will be regions for particular memory
banks.

The only time reserving regions for specific devices might be necessary is when
multiple drivers sharing the same region would fragment it. But if
drivers just allocate the memory at boot, then shared regions work fine.
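
To illustrate why boot-time allocation from a shared region does not
fragment, here is a minimal sketch of carving buffers sequentially from one
region. It is not the best-fit allocator from the patch; the names are
hypothetical and offsets stand in for physical addresses:

```c
#include <stddef.h>

/* Hypothetical shared region; offsets stand in for physical addresses. */
struct shared_region {
	size_t size;  /* total bytes in the region */
	size_t next;  /* offset of the first unused byte */
};

/*
 * Carve `size` bytes aligned to `align` (a power of two) from the region.
 * Returns the offset of the buffer, or (size_t)-1 if it does not fit.
 */
static size_t region_carve(struct shared_region *r, size_t size, size_t align)
{
	size_t start;

	/* Only power-of-two alignments are supported in this sketch. */
	if (align == 0 || (align & (align - 1)) != 0)
		return (size_t)-1;

	/* Round the next free offset up to the requested alignment. */
	start = (r->next + align - 1) & ~(align - 1);
	if (start < r->next || start > r->size || size > r->size - start)
		return (size_t)-1;

	r->next = start + size;
	return start;
}
```

As long as nothing is freed, every driver simply takes the next aligned slice
and the only space lost is alignment padding.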

> 
> > And it can know that provided regions are setup by the platform code and
> > not created dynamically. This will simplify things enormously.
> > 
> > > +    1. The core of the CMA does not handle allocation of memory and
> > > +       management of free space.  Dedicated allocators are used for
> > > +       that purpose.
> > > +
> > > +       This way, if the provided solution does not match demands
> > > +       imposed on a given system, one can develop a new algorithm and
> > > +       easily plug it into the CMA framework.
> > > +
> > > +       The presented solution includes an implementation of a best-fit
> > > +       algorithm.
> > 
> > Again, do we really need user-settable per-region allocators? Just provide
> > one with the option to later choose others through the kernel Kconfig files.
> 
> From our experience, yes. Different allocators can cope with different memory
> usage scenarios better or worse. This results in higher or lower memory
> fragmentation. System use cases are something that the kernel or drivers are
> definitely not aware of, so only user space can tune this parameter to get the
> best possible system behavior.

Can you describe some of those usage scenarios? The cases I am familiar with
are the standard: 'need to allocate X buffers of Y MB memory each on boot' and
these buffers are only freed when the module is unloaded. While you can get
fragmentation in that case if you creatively load and unload modules, this is
not the normal usage in a deployed system.

We already have several 'normal' memory allocators (slab, slub, slob, sleb,
whatever. I've lost track there.) and I'm not very enthusiastic about the
idea of adding another bunch of them. Especially not in a first release.

Keep it simple. Just fix the core problem first, even if you are not yet able
to squeeze the last byte of memory from your system. Later you can always add
features as needed and optimize it further.
 
> > We can always add more complex scenarios later, but for an initial version
> > I'd keep it simple.
> > 
> > > +
> > > +    2. CMA allows a run-time configuration of the memory regions it
> > > +       will use to allocate chunks of memory from.  The set of memory
> > > +       regions is given on command line so it can be easily changed
> > > +       without the need for recompiling the kernel.
> > > +
> > > +       Each region has its own size, alignment demand, a start
> > > +       address (physical address where it should be placed) and an
> > > +       allocator algorithm assigned to the region.
> > > +
> > > +       This means that there can be different algorithms running at
> > > +       the same time, if different devices on the platform have
> > > +       distinct memory usage characteristics and different algorithms
> > > +       match those best.
> > 
> > Seems overengineering to me. Just ensure that the code can be extended
> > later to such hypothetical scenarios. They are hypothetical, right?
> 
> Not really. Having the possibility to reconfigure memory configuration
> without kernel recompilation is very handy when one is tuning the
> configuration for the specific use case.
> 
> > > +    3. When requesting memory, devices have to introduce themselves.
> > > +       This way CMA knows who the memory is allocated for.  This
> > > +       allows for the system architect to specify which memory regions
> > > +       each device should use.
> > > +
> > > +       3a. Devices can also specify a "kind" of memory they want.
> > > +           This makes it possible to configure the system in such
> > > +           a way, that a single device may get memory from different
> > > +           memory regions, depending on the "kind" of memory it
> > > +           requested.  For example, a video codec driver might want to
> > > +           allocate some shared buffers from the first memory bank and
> > > +           the other from the second to get the highest possible
> > > +           memory throughput.
> > 
> > Not sure I understand this. Isn't this just two regions, one for each
> > memory bank, and the driver requests some buffers from one region and
> > some from the other?
> 
> Right.
> 
> > Not sure how a 'kind of memory' features in this.
> 
> This 'kind' is just a cookie or a label used by the driver to distinguish
> requests for both memory banks. This functionality is essential for our
> hardware (just for hardware video codec we have 3 'kinds' of memory: memory
> bank A, memory bank B and special region for the firmware).

Let me see if I understand: driver D needs memory from bank A, so it specifies
e.g. label "a" when requesting memory. And the kernel parameters are then supposed
to define a region for memory bank A (say region "bank-a") and tell the cma that
requests from driver D for memory of kind "a" should go to region "bank-a".

Of course, the driver can also just request memory from the platform-defined
region "bank-a" directly rather than having to rely on userspace provided
parameters.
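
The mapping described above can be sketched as a small rule table. This is
only an illustration of the lookup, assuming a hypothetical rule format in
which a NULL kind matches any request; none of the names come from the patch:

```c
#include <stddef.h>
#include <string.h>

/* One mapping rule: requests from `dev` for memory of `kind` go to `region`.
 * A NULL `kind` in a rule matches any kind. All names are hypothetical. */
struct map_rule {
	const char *dev;
	const char *kind;
	const char *region;
};

static const struct map_rule rules[] = {
	{ "driver-d", "a",  "bank-a" },
	{ "driver-d", "b",  "bank-b" },
	{ "driver-d", NULL, "dma"    },  /* fallback for any other kind */
};

/* Return the region name for (dev, kind), or NULL if no rule matches. */
static const char *resolve_region(const char *dev, const char *kind)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		if (strcmp(rules[i].dev, dev) != 0)
			continue;
		if (rules[i].kind == NULL ||
		    (kind != NULL && strcmp(rules[i].kind, kind) == 0))
			return rules[i].region;
	}
	return NULL;
}
```

A driver that names the platform-defined region directly skips this
indirection altogether, which is the simpler alternative.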
 
> > > +    4. For greater flexibility and extensibility, the framework allows
> > > +       device drivers to register private regions of reserved memory
> > > +       which then may be used only by them.
> > > +
> > > +       As an effect, if a driver would not use the rest of the CMA
> > > +       interface, it can still use CMA allocators and other
> > > +       mechanisms.
> > 
> > Why would you? Is there an actual driver that will need this?
> 
> This feature has been added after posting v1 of this rfc/patch. Jonathan 
> Corbet suggested in <http://article.gmane.org/gmane.linux.kernel.mm/50689> 
> that viafb driver might register its own private memory and use cma just
> as an allocator. IMHO this is a good idea, this way we might remove a bunch
> of custom allocators from the drivers (yes, there are such all over the
> kernel).

It goes back to whether we want per-device (or per-driver) regions or just
have global regions. What I have seen in practice is that these drivers just
need X amount of contiguous memory on boot. Having just a single region (as
it will be for most systems) to carve the buffers from is at least as efficient
as creating separate regions for each driver. Only if you
start freeing and reallocating memory later on will you get into trouble.

But if you do that, then you are trying to duplicate the behavior of the
normal allocators in my opinion. I really don't think we want to go there.

> 
> > > +       4a. Early in boot process, device drivers can also request the
> > > +           CMA framework to reserve a region of memory for them
> > > +           which then will be used as a private region.
> > > +
> > > +           This way, drivers do not need to directly call bootmem,
> > > +           memblock or similar early allocator but merely register an
> > > +           early region and the framework will handle the rest
> > > +           including choosing the right early allocator.
> > 
> > The whole concept of private regions seems unnecessary to me.
> > 
> > <big snip>
> > 
> > It looks to me as if you tried to think of all possible hypothetical
> > situations and write a framework for that. Of course, you may know more
> > than I do, and some of these situations actually happen.
> 
> Not exactly. We tried to design a solution that would cover all requirements
> for OUR (quite specific) embedded hardware. However we didn't want to tie it
> only to our platform. We just generalized most of our requirements so they can
> be reused for other systems.
> 
> > The basic design ideas are solid, I think. But you should get rid of all
> > the fancy features and go back to basics. We can always add those features
> > later should that become necessary. But removing features is much, much
> > harder.
> 
> Well, please keep in mind that we cannot remove features that are essential
> for our solution. We know that a simple framework has some advantages
> (well, the most important one is the fact that it is easy to understand),
> but making it too simple would render it useless from our point of view
> (if it would not provide functionality required by our drivers and hardware).

Why not give a list of the requirements that your hardware has? The only
requirement unique to your hardware that I am aware of is the need for buffers
in specific memory banks (and apparently a special memory area for firmware).

So besides a 'dma' region (which is probably sufficient for most systems)
you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.

In the kernel parameters the user can specify the sizes for these regions
that are reserved at boot time and drivers will use cma to get their memory
from the relevant region using a simple allocator.
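
As a rough sketch of how little parsing that would need, the function below
reads per-region sizes from a parameter string. The "name=size" syntax with
K/M/G suffixes is an assumption made for this example, not the format used by
the patch, and in the kernel something like memparse() would normally do the
suffix handling:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Find "<name>=<size>" in a comma-separated parameter string and return the
 * size in bytes, honouring K, M and G suffixes. Returns 0 when the region is
 * not mentioned or the value is malformed. The syntax is an assumption.
 */
static unsigned long long region_size(const char *params, const char *name)
{
	size_t len = strlen(name);
	const char *p = params;

	while (p != NULL && *p != '\0') {
		if (strncmp(p, name, len) == 0 && p[len] == '=') {
			char *end;
			unsigned long long v = strtoull(p + len + 1, &end, 10);

			if (*end == 'K') {
				v <<= 10;
				end++;
			} else if (*end == 'M') {
				v <<= 20;
				end++;
			} else if (*end == 'G') {
				v <<= 30;
				end++;
			}
			/* Anything left before the next entry is malformed. */
			return (*end == ',' || *end == '\0') ? v : 0;
		}
		p = strchr(p, ',');
		if (p != NULL)
			p++;  /* step past the comma to the next entry */
	}
	return 0;
}
```

For example, a string such as "dma=16M,banka=32M,fw=512K" would size three
regions and leave any unmentioned region at its platform default.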

If there are other requirements in your hardware (or other hardware) that
should be taken into account for an initial version of the cma, then I'm
interested in hearing about it.

One note: since the first version of the patch wasn't posted to linux-media
I didn't follow the replies to that. If there are things in that thread that
I should read, then just mail me some links.

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27 12:08       ` Russell King - ARM Linux
  (?)
@ 2010-07-28  8:53         ` Michał Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michał Nazarewicz @ 2010-07-28  8:53 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Daniel Walker, Jonathan Corbet, Pawel Osciak, Mark Brown,
	linux-kernel, Hiremath Vaibhav, FUJITA Tomonori, linux-mm,
	Kyungmin Park, Zach Pfeffer, Marek Szyprowski, linux-arm-kernel,
	linux-media

>> +static inline dma_addr_t __must_check
>> +cma_alloc(const struct device *dev, const char *kind,
>> +	  size_t size, dma_addr_t alignment)
>> +{
>> +	return dev ? -EINVAL : __cma_alloc(dev, kind, size, alignment);

On Tue, 27 Jul 2010 14:08:41 +0200, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> So I can't use this to allocate memory for anything but a NULL device?

>> +static inline int
>> +cma_info(struct cma_info *info, const struct device *dev, const char *kind)
>> +{
>> +	return dev ? -EINVAL : __cma_info(info, dev, kind);

> This won't return information for anything but a NULL device?

Obviously a bug.  Thanks for spotting that.
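
For reference, a minimal user-space sketch of the presumably intended check,
with the two arms of the conditional swapped back. The typedef, the struct and
cma_alloc_impl() are stand-ins for the kernel's dma_addr_t, struct device and
__cma_alloc(); this is not the actual fix that was later posted:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;   /* stand-in for the kernel type */
struct device { const char *name; };

/* Stub standing in for __cma_alloc(); always "allocates" at 0x1000. */
static dma_addr_t cma_alloc_impl(const struct device *dev, const char *kind,
				 size_t size, dma_addr_t alignment)
{
	(void)dev;
	(void)kind;
	(void)size;
	(void)alignment;
	return 0x1000;
}

/* Reject a NULL device and pass a valid one on to the allocator. */
static dma_addr_t cma_alloc(const struct device *dev, const char *kind,
			    size_t size, dma_addr_t alignment)
{
	return dev ? cma_alloc_impl(dev, kind, size, alignment)
		   : (dma_addr_t)-EINVAL;
}
```

The same swap would apply to cma_info().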

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-28  8:53         ` Michał Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michał Nazarewicz @ 2010-07-28  8:53 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Daniel Walker, Jonathan Corbet, Pawel Osciak, Mark Brown,
	linux-kernel, Hiremath Vaibhav, FUJITA Tomonori, linux-mm,
	Kyungmin Park, Zach Pfeffer, Marek Szyprowski, linux-arm-kernel,
	linux-media

>> +static inline dma_addr_t __must_check
>> +cma_alloc(const struct device *dev, const char *kind,
>> +	  size_t size, dma_addr_t alignment)
>> +{
>> +	return dev ? -EINVAL : __cma_alloc(dev, kind, size, alignment);

On Tue, 27 Jul 2010 14:08:41 +0200, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> So I can't use this to allocate memory for anything but a NULL device?

>> +static inline int
>> +cma_info(struct cma_info *info, const struct device *dev, const char *kind)
>> +{
>> +	return dev ? -EINVAL : __cma_info(info, dev, kind);

> This won't return information for anything but a NULL device?

Obviously a bug.  Thanks for spotting that.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 57+ messages in thread

* RE: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-27 16:27           ` Hans Verkuil
  (?)
@ 2010-07-28  9:04             ` Marek Szyprowski
  -1 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-28  9:04 UTC (permalink / raw)
  To: 'Hans Verkuil'
  Cc: Michal Nazarewicz, linux-mm, linux-kernel, linux-media,
	linux-arm-kernel, 'Hiremath Vaibhav',
	Pawel Osciak, 'Mark Brown', 'Daniel Walker',
	'Jonathan Corbet', 'FUJITA Tomonori',
	'Zach Pfeffer', 'Kyungmin Park'

Hello,

On Tuesday, July 27, 2010 6:27 PM Hans Verkuil wrote:

> On Tuesday 27 July 2010 09:41:40 Marek Szyprowski wrote:
> > On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:
> > > On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > > > The Contiguous Memory Allocator framework is a set of APIs for
> > > > allocating physically contiguous chunks of memory.
> > > >
> > > > Various chips require contiguous blocks of memory to operate.  Those
> > > > chips include devices such as cameras, hardware video decoders and
> > > > encoders, etc.
> > > >
> > > > The code is highly modular and customisable to suit the needs of
> > > > various users.  Set of regions reserved for CMA can be configured on
> > > > run-time and it is easy to add custom allocator algorithms if one
> > > > has such need.
> > > >
> > > > For more details see Documentation/contiguous-memory.txt.
> > > >
> > > > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > > > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > > > ---
> > > >  Documentation/00-INDEX                             |    2 +
> > > >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> > > >  Documentation/contiguous-memory.txt                |  646
> +++++++++++
> > > >  Documentation/kernel-parameters.txt                |    4 +
> > > >  include/linux/cma.h                                |  445 ++++++++
> > > >  mm/Kconfig                                         |   34 +
> > > >  mm/Makefile                                        |    3 +
> > > >  mm/cma-best-fit.c                                  |  407 +++++++
> > > >  mm/cma.c                                           | 1170
> > > ++++++++++++++++++++
> > > >  9 files changed, 2720 insertions(+), 0 deletions(-)
> > > >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-
> contiguous
> > > >  create mode 100644 Documentation/contiguous-memory.txt
> > > >  create mode 100644 include/linux/cma.h
> > > >  create mode 100644 mm/cma-best-fit.c
> > > >  create mode 100644 mm/cma.c
> > > >
> > > > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > > > index 5405f7a..bb50209 100644
> > > > --- a/Documentation/00-INDEX
> > > > +++ b/Documentation/00-INDEX
> > > > @@ -94,6 +94,8 @@ connector/
> > > >  	- docs on the netlink based userspace<->kernel space
> communication
> > > mod.
> > > >  console/
> > > >  	- documentation on Linux console drivers.
> > > > +contiguous-memory.txt
> > > > +	- documentation on physically-contiguous memory allocation
> framework.
> > > >  cpu-freq/
> > > >  	- info on CPU frequency and voltage scaling.
> > > >  cpu-hotplug.txt
> > > > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > > new file mode 100644
> > > > index 0000000..05e2f6a
> > > > --- /dev/null
> > > > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > > @@ -0,0 +1,9 @@
> > > > +What:		/sys/kernel/mm/contiguous/
> > > > +Date:		July 2008
> > > > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > > +Description:
> > > > +		/sys/kernel/mm/contiguous/ contains two files: asterisk
> and
> > > > +		map.  They are used to configure the Contiguous Memory
> > > > +		Allocator framework.
> > > > +
> > > > +		For details see Documentation/contiguous-memory.txt.
> > > > diff --git a/Documentation/contiguous-memory.txt
> > > b/Documentation/contiguous-memory.txt
> > > > new file mode 100644
> > > > index 0000000..6eb1295
> > > > --- /dev/null
> > > > +++ b/Documentation/contiguous-memory.txt
> > > > @@ -0,0 +1,646 @@
> > > > +                                                             -*- org
> -*-
> > > > +
> > > > +* Contiguous Memory Allocator
> > > > +
> > > > +   The Contiguous Memory Allocator (CMA) is a framework, which
> allows
> > > > +   setting up a machine-specific configuration for physically-
> contiguous
> > > > +   memory management. Memory for devices is then allocated according
> > > > +   to that configuration.
> > > > +
> > > > +   The main role of the framework is not to allocate memory, but to
> > > > +   parse and manage memory configurations, as well as to act as an
> > > > +   in-between between device drivers and pluggable allocators. It is
> > > > +   thus not tied to any memory allocation method or strategy.
> > > > +
> > > > +** Why is it needed?
> > > > +
> > > > +    Various devices on embedded systems have no scatter-getter
> and/or
> > > > +    IO map support and as such require contiguous blocks of memory
> to
> > > > +    operate.  They include devices such as cameras, hardware video
> > > > +    decoders and encoders, etc.
> > > > +
> > > > +    Such devices often require big memory buffers (a full HD frame
> is,
> > > > +    for instance, more then 2 mega pixels large, i.e. more than 6 MB
> > > > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > > > +
> > > > +    Some embedded devices impose additional requirements on the
> > > > +    buffers, e.g. they can operate only on buffers allocated in
> > > > +    particular location/memory bank (if system has more than one
> > > > +    memory bank) or buffers aligned to a particular memory boundary.
> > > > +
> > > > +    Development of embedded devices have seen a big rise recently
> > > > +    (especially in the V4L area) and many such drivers include their
> > > > +    own memory allocation code. Most of them use bootmem-based
> methods.
> > > > +    CMA framework is an attempt to unify contiguous memory
> allocation
> > > > +    mechanisms and provide a simple API for device drivers, while
> > > > +    staying as customisable and modular as possible.
> > > > +
> > > > +** Design
> > > > +
> > > > +    The main design goal for the CMA was to provide a customisable
> and
> > > > +    modular framework, which could be configured to suit the needs
> of
> > > > +    individual systems.  Configuration specifies a list of memory
> > > > +    regions, which then are assigned to devices.  Memory regions can
> > > > +    be shared among many device drivers or assigned exclusively to
> > > > +    one.  This has been achieved in the following ways:
> > >
> > > OK, I like the idea of regions, i.e. defining memory areas with
> specific
> > > properties or uses.
> > >
> > > But why should it be possible to define regions through kernel
> parameters?
> > > Regions are typically fixed for a particular platform and can be setup
> in
> > > the
> > > platform specific code. Actually, one region could be setup by default:
> > > DMA-able memory. That would be very handy in fact for many PCI-based TV
> > > capture drivers.
> >
> > IMHO this is a just desktop-point-of-view. In embedded world things are
> > a bit different. Most SoCs have a some kind of common system memory and
> > usually all build-in peripherals are able to DMA to any part of it (there
> is
> > no DMA specific hardware zone).
> 
> There are still desktops in the world, you know :-) It's nice if cma can
> also be used there. Having a region called 'dma' or something like that
> would be useful for drivers. And yes, for most embedded systems that would
> be the equivalent of common memory.

I really see no problem using CMA on desktops, although in most cases only
a few drivers will be able to make use of it.

> > > I think that the only thing that you want to set in the kernel params
> is
> > > the size of each region.
> >
> > Keeping it as a kernel parameter is very handy for development. But I
> agree
> > that we might make it dependent on some Kconfig entry. This way a
> platform
> > setup code would provide default region description just as an array of
> the
> > region structures and we will get rid of the parsing code in the release
> > versions.
> 
> I think that would help, yes.
> 
> > > The same with assigning regions to drivers: why would you want to do
> that?
> > > The driver should know which regions it can use (with possible
> fallbacks).
> >
> > I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> > desktop it is perfectly acceptable to have a separate memory region for
> each
> > device. In embedded world memory is a precious resource. Of course we can
> go
> > the 'separate memory region for each device' way, but we observed that at
> > least some memory can be recovered if we decide to share memory regions
> for
> > some of the devices.
> >
> > Assigning regions to the drivers is a way to describe how memory can be
> > shared. This is something that is independent from the actual drivers.
> > Device drivers cannot and mustn't have such knowledge.
> 
> You misunderstand me. I wasn't promoting separate memory regions for each
> device.  Quite the opposite: initially I would make regions shared by default.
> I actually do not see why device drivers can't have the knowledge to allocate
> their memory from particular regions. We are talking about embedded systems 
> where these types of memory allocations tend to be well defined. E.g. the
> video capture driver needs to allocate X video buffers at boot time. For
> most devices that will be the dma region, in the case of samsung devices
> that will be regions for particular memory banks.

Let me introduce one more example. As you may know, we have 3 video-processor
capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
can capture video data from a camera sensor and accelerate common video
processing tasks (like up/down scaling and color space conversion). Two FIMCs
and the MFC are required for things like HD video encoding or decoding with
online display/preview. This task requires huge video buffers that are
usually allocated and freed at the same time. The third FIMC can be used for
gfx acceleration (color space conversion and scaling are quite common tasks
in GUI). This latter task usually introduces a lot of memory fragmentation,
as gfx surfaces are usually quite small (definitely smaller than HD frames
or an 8MPix picture from a camera). It would be really wise to have that
third FIMC device use a memory buffer shared with the 3D accelerator
(which has quite similar usage scenarios and suffers from similar memory
fragmentation).

> The only time reserving regions for specific devices might be necessary is
> if you can get memory fragmentation if multiple drivers use the same region.
> But if drivers just allocate the memory at boot, then shared regions work
> fine.
> >
> > > And it can know that provided regions are setup by the platform code
> and
> > > not created dynamically. This will simplify things enormously.
> > >
> > > > +    1. The core of the CMA does not handle allocation of memory and
> > > > +       management of free space.  Dedicated allocators are used for
> > > > +       that purpose.
> > > > +
> > > > +       This way, if the provided solution does not match demands
> > > > +       imposed on a given system, one can develop a new algorithm
> and
> > > > +       easily plug it into the CMA framework.
> > > > +
> > > > +       The presented solution includes an implementation of a best-
> fit
> > > > +       algorithm.
> > >
> > > Again, do we really need user-settable per-region allocators? Just
> provide
> > > one with the option to later choose others through the kernel Kconfig
> files.
> >
> > From our experience, yes. Different allocators can cope with different
> memory
> > usage scenarios better or worse. This results in higher or lower memory
> > fragmentation. System use cases are something that kernel or drivers are
> > definitely not aware, so only user space can tune this parameter to get
> the
> > best possible system behavior.
> 
> Can you describe some of those usage scenarios? The cases I am familiar
> with
> are the standard: 'need to allocate X buffers of Y MB memory each on boot'
> and these buffers are only freed when the module is unloaded. While you can
> get fragmentation in that case if you creatively load and unload modules,
> this is not the normal usage in a deployed system.

We don't want to allocate X buffers of Y MB memory each on boot. Instead we
want to just reserve XX MB memory and then dynamically allocate buffers from
it. This enables us to perform the following 2 tasks:
1. movie decoding in HD-quality (only one instance)
2. two instances of SD-quality movie decoding and SD-quality movie encoding
   (example: video conference)

We know that these two use cases are exclusive, so they can use the same
reserved memory.
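
A rough sketch of the sizing argument, under stated assumptions: if use
cases never run at the same time, the shared region only has to hold the
largest of them, not the sum of all of them. The struct and function names
below are hypothetical, and fragmentation inside the region is ignored.

```c
#include <stddef.h>

/* One use case: the buffers it needs simultaneously (sizes in MB). */
struct use_case {
	const char *name;
	const size_t *buffers;
	size_t count;
};

/* Total memory a single use case needs while it is running. */
static size_t use_case_total(const struct use_case *uc)
{
	size_t sum = 0;

	for (size_t i = 0; i < uc->count; i++)
		sum += uc->buffers[i];
	return sum;
}

/*
 * Mutually exclusive use cases can share one region, so reserve the
 * largest per-use-case total rather than the sum of all totals.
 */
static size_t shared_reservation(const struct use_case *cases, size_t n)
{
	size_t max = 0;

	for (size_t i = 0; i < n; i++) {
		size_t total = use_case_total(&cases[i]);

		if (total > max)
			max = total;
	}
	return max;
}
```

With separate per-device regions the reservation would instead be the sum
of the per-use-case totals, which is the memory that sharing recovers.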

> We already have several 'normal' memory allocators (slab, slub, slob, sleb,
> whatever. I've lost track there.) and I'm not very enthusiastic about the
> idea of adding another bunch of them. Especially not in a first release.
>
> Keep it simple. Just fix the core problem first, even if you are not yet
> able to squeeze the last byte of memory from your system. Later you can 
> always add features as needed and optimize it further.

We are trying to keep it as simple as possible :)
 
> > > We can always add more complex scenarios later, but for an initial
> version
> > > I'd keep it simple.
> > >
> > > > +
> > > > +    2. CMA allows a run-time configuration of the memory regions it
> > > > +       will use to allocate chunks of memory from.  The set of
> memory
> > > > +       regions is given on command line so it can be easily changed
> > > > +       without the need for recompiling the kernel.
> > > > +
> > > > +       Each region has it's own size, alignment demand, a start
> > > > +       address (physical address where it should be placed) and an
> > > > +       allocator algorithm assigned to the region.
> > > > +
> > > > +       This means that there can be different algorithms running at
> > > > +       the same time, if different devices on the platform have
> > > > +       distinct memory usage characteristics and different algorithm
> > > > +       match those the best way.
> > >
> > > Seems overengineering to me. Just ensure that the code can be extended
> > > later to such hypothetical scenarios. They are hypothetical, right?
> >
> > Not really. Having the possibility to reconfigure memory configuration
> > without kernel recompilation is very handy when one is tuning the
> > configuration for the specific use case.
> >
> > > > +    3. When requesting memory, devices have to introduce themselves.
> > > > +       This way CMA knows who the memory is allocated for.  This
> > > > +       allows for the system architect to specify which memory
> regions
> > > > +       each device should use.
> > > > +
> > > > +       3a. Devices can also specify a "kind" of memory they want.
> > > > +           This makes it possible to configure the system in such
> > > > +           a way, that a single device may get memory from different
> > > > +           memory regions, depending on the "kind" of memory it
> > > > +           requested.  For example, a video codec driver might want
> to
> > > > +           allocate some shared buffers from the first memory bank
> and
> > > > +           the other from the second to get the highest possible
> > > > +           memory throughput.
> > >
> > > Not sure I understand this. Isn't this just two regions, one for each
> > > memory bank,
> > > and the driver requests some buffers from one region and some from the
> > > other?
> >
> > Right.
> >
> > > Not sure how a 'kind of memory' features in this.
> >
> > This 'kind' is a just cookie or a label used by the driver to distinguish
> > requests for both memory banks. This functionality is essential for our
> > hardware (just for hardware video codec we have 3 'kinds' of memory:
> memory
> > bank A, memory bank B and special region for the firmware).
> 
> Let me see if I understand: driver D needs memory from bank A, so it
> specifies
> e.g. label "a" when requesting memory. And the kernel parameters are then
> supposed
> to define a region for memory bank A (say region "bank-a" and tell the cma
> that
> requests from driver D for memory of kind "a" should go to region "bank-a".
> 
> Of course, the driver can also just request memory from the platform-
> defined
> region "bank-a" directly rather than having to rely on userspace provided
> parameters.

Right. Regions provided on the kernel command line are considered to be an
alternative to the default configuration, for tuning/debugging purposes.
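
A minimal sketch of that idea, assuming a simple table that maps a
(device, kind) pair to a region name. The struct map_entry layout and the
lookup()/resolve_region() helpers are hypothetical and are not the CMA
implementation; the point is only that a command-line table, when present,
is used instead of the platform default.

```c
#include <stddef.h>
#include <string.h>

/* One mapping rule: requests from dev for kind go to region. */
struct map_entry {
	const char *dev;
	const char *kind;
	const char *region;
};

/* Return the region of the first matching rule, or NULL if none match. */
static const char *lookup(const struct map_entry *map, size_t n,
			  const char *dev, const char *kind)
{
	for (size_t i = 0; i < n; i++) {
		if (!strcmp(map[i].dev, dev) && !strcmp(map[i].kind, kind))
			return map[i].region;
	}
	return NULL;
}

/*
 * A table supplied at boot (e.g. parsed from the command line) replaces
 * the platform default wholesale; otherwise the default is used.
 */
static const char *resolve_region(const struct map_entry *cmdline, size_t n_cmdline,
				  const struct map_entry *defaults, size_t n_defaults,
				  const char *dev, const char *kind)
{
	if (cmdline && n_cmdline)
		return lookup(cmdline, n_cmdline, dev, kind);
	return lookup(defaults, n_defaults, dev, kind);
}
```

Treating the command-line table as a full replacement rather than a
per-entry override is a design choice of this sketch only.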

> > > > +    4. For greater flexibility and extensibility, the framework
> allows
> > > > +       device drivers to register private regions of reserved memory
> > > > +       which then may be used only by them.
> > > > +
> > > > +       As an effect, if a driver would not use the rest of the CMA
> > > > +       interface, it can still use CMA allocators and other
> > > > +       mechanisms.
> > >
> > > Why would you? Is there an actual driver that will need this?
> >
> > This feature has been added after posting v1 of this rfc/patch. Jonathan
> > Corbet suggested in
> <http://article.gmane.org/gmane.linux.kernel.mm/50689>
> > that viafb driver might register its own private memory and use cma just
> > as an allocator. IMHO this is a good idea, this way we might remove a
> bunch
> > of custom allocators from the drivers (yes, there are such all over the
> > kernel).
> 
> It goes back to whether we want per-device (or per-driver) regions or just
> have global regions.

IMHO both types of regions can coexist and we can make use of the
advantages of each. I see no point in limiting CMA to only one type of region.

> What I have seen in practice is that these drivers just
> need X amount of contiguous memory on boot. Having just a single region (as
> it will be for most systems) to carve the buffers from is just as efficient
> if not more than creating separate regions for each driver. Only if you
> start freeing and reallocating memory later on will you get into trouble.

> But if you do that, then you are trying to duplicate the behavior of the
> normal allocators in my opinion. I really don't think we want to go there.

Well, then why do some drivers like viafb or omapfb have custom memory
allocators? This shows that there is still a need for another one.

> > > > +       4a. Early in boot process, device drivers can also request
> the
> > > > +           CMA framework to a reserve a region of memory for them
> > > > +           which then will be used as a private region.
> > > > +
> > > > +           This way, drivers do not need to directly call bootmem,
> > > > +           memblock or similar early allocator but merely register
> an
> > > > +           early region and the framework will handle the rest
> > > > +           including choosing the right early allocator.
> > >
> > > The whole concept of private regions seems unnecessary to me.
> > >
> > > <big snip>
> > >
> > > It looks to me as if you tried to think of all possible hypothetical
> > > situations
> > > and write a framework for that. Of course, you may know more than I do,
> > > and some of these situations actually happen.
> >
> > Not exactly. We tried to design a solution that would cover all
> requirements
> > for OUR (quite specific) embedded hardware. However we didn't want to tie
> it
> > only to our platform. We just generalized most of our requirements so
> they can
> > be reused for other systems.
> >
> > > The basic design ideas are solid, I think. But you should get rid of
> all
> > > the fancy features and go back to basics. We can always add those
> features
> > > later should that become necessary. But removing features is much, much
> > harder.
> >
> > Well, please keep in mind that we cannot remove features that are
> essential
> > for our solution. We know that a simple framework have some advantages
> > (well, the most important one is the fact that it is easy to understand),
> > but making it too simple would render it useless from our point of view
> > (if it would not provide functionality required by our drivers and
> hardware).
> 
> Why not give a list of the requirements that your hardware has? The only
> requirement unique to your hardware that I am aware of is the need for
> buffers in specific memory banks (and apparently a special memory area for
> firmware).

Yes, this is a real hardware requirement that cannot be dropped. Other
requirements come from the system design (software).

> So besides a 'dma' region (which is probably sufficient for most systems)
> you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.

Actually this DMA region is either just 'banka' or 'bankb'. There are no
special requirements for the common DMA memory and no need for a separate
'dma' region in our case. We just want to make it shared with at least the
camera capture buffers.

> In the kernel parameters the user can specify the sizes for these regions
> that are reserved at boot time and drivers will use cma to get their memory
> from the relevant region using a simple allocator.
> 
> If there are other requirements in your hardware (or other hardware) that
> should be taken into account for an initial version of the cma, then I'm
> interested in hearing about it.

We will try to summarize our requirements again and repost them soon.
 
> One note: since the first version of the patch wasn't posted to linux-media
> I didn't follow the replies to that. If there are things in that thread
> that I should read, then just mail me some links.

I'm very sorry that the first version was posted only to linux-mm.
The discussion was mainly about how the memory configuration and
device mapping should be provided (the first version used only kernel
parameters for that). We should probably also CC a few more kernel
subsystems that might benefit from a common memory allocator, but we were
already a bit scared by the number of people and lists in this series. We
really don't want to get a 'spammer' label ;)

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center




^ permalink raw reply	[flat|nested] 57+ messages in thread

* RE: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-28  9:04             ` Marek Szyprowski
  0 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-28  9:04 UTC (permalink / raw)
  To: 'Hans Verkuil'
  Cc: Michal Nazarewicz, linux-mm, linux-kernel, linux-media,
	linux-arm-kernel, 'Hiremath Vaibhav',
	Pawel Osciak, 'Mark Brown', 'Daniel Walker',
	'Jonathan Corbet', 'FUJITA Tomonori',
	'Zach Pfeffer', 'Kyungmin Park'

Hello,

On Tuesday, July 27, 2010 6:27 PM Hans Verkuil wrote:

> On Tuesday 27 July 2010 09:41:40 Marek Szyprowski wrote:
> > On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:
> > > On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > > > The Contiguous Memory Allocator framework is a set of APIs for
> > > > allocating physically contiguous chunks of memory.
> > > >
> > > > Various chips require contiguous blocks of memory to operate.  Those
> > > > chips include devices such as cameras, hardware video decoders and
> > > > encoders, etc.
> > > >
> > > > The code is highly modular and customisable to suit the needs of
> > > > various users.  Set of regions reserved for CMA can be configured on
> > > > run-time and it is easy to add custom allocator algorithms if one
> > > > has such need.
> > > >
> > > > For more details see Documentation/contiguous-memory.txt.
> > > >
> > > > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > > > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > > > ---
> > > >  Documentation/00-INDEX                             |    2 +
> > > >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> > > >  Documentation/contiguous-memory.txt                |  646
> +++++++++++
> > > >  Documentation/kernel-parameters.txt                |    4 +
> > > >  include/linux/cma.h                                |  445 ++++++++
> > > >  mm/Kconfig                                         |   34 +
> > > >  mm/Makefile                                        |    3 +
> > > >  mm/cma-best-fit.c                                  |  407 +++++++
> > > >  mm/cma.c                                           | 1170
> > > ++++++++++++++++++++
> > > >  9 files changed, 2720 insertions(+), 0 deletions(-)
> > > >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-
> contiguous
> > > >  create mode 100644 Documentation/contiguous-memory.txt
> > > >  create mode 100644 include/linux/cma.h
> > > >  create mode 100644 mm/cma-best-fit.c
> > > >  create mode 100644 mm/cma.c
> > > >
> > > > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > > > index 5405f7a..bb50209 100644
> > > > --- a/Documentation/00-INDEX
> > > > +++ b/Documentation/00-INDEX
> > > > @@ -94,6 +94,8 @@ connector/
> > > >  	- docs on the netlink based userspace<->kernel space
> communication
> > > mod.
> > > >  console/
> > > >  	- documentation on Linux console drivers.
> > > > +contiguous-memory.txt
> > > > +	- documentation on physically-contiguous memory allocation
> framework.
> > > >  cpu-freq/
> > > >  	- info on CPU frequency and voltage scaling.
> > > >  cpu-hotplug.txt
> > > > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > > new file mode 100644
> > > > index 0000000..05e2f6a
> > > > --- /dev/null
> > > > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > > @@ -0,0 +1,9 @@
> > > > +What:		/sys/kernel/mm/contiguous/
> > > > +Date:		July 2008
> > > > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > > +Description:
> > > > +		/sys/kernel/mm/contiguous/ contains two files: asterisk
> and
> > > > +		map.  They are used to configure the Contiguous Memory
> > > > +		Allocator framework.
> > > > +
> > > > +		For details see Documentation/contiguous-memory.txt.
> > > > diff --git a/Documentation/contiguous-memory.txt
> > > b/Documentation/contiguous-memory.txt
> > > > new file mode 100644
> > > > index 0000000..6eb1295
> > > > --- /dev/null
> > > > +++ b/Documentation/contiguous-memory.txt
> > > > @@ -0,0 +1,646 @@
> > > > +                                                             -*- org
> -*-
> > > > +
> > > > +* Contiguous Memory Allocator
> > > > +
> > > > +   The Contiguous Memory Allocator (CMA) is a framework, which
> allows
> > > > +   setting up a machine-specific configuration for physically-
> contiguous
> > > > +   memory management. Memory for devices is then allocated according
> > > > +   to that configuration.
> > > > +
> > > > +   The main role of the framework is not to allocate memory, but to
> > > > +   parse and manage memory configurations, as well as to act as an
> > > > +   in-between between device drivers and pluggable allocators. It is
> > > > +   thus not tied to any memory allocation method or strategy.
> > > > +
> > > > +** Why is it needed?
> > > > +
> > > > +    Various devices on embedded systems have no scatter-getter
> and/or
> > > > +    IO map support and as such require contiguous blocks of memory
> to
> > > > +    operate.  They include devices such as cameras, hardware video
> > > > +    decoders and encoders, etc.
> > > > +
> > > > +    Such devices often require big memory buffers (a full HD frame
> is,
> > > > +    for instance, more then 2 mega pixels large, i.e. more than 6 MB
> > > > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > > > +
> > > > +    Some embedded devices impose additional requirements on the
> > > > +    buffers, e.g. they can operate only on buffers allocated in
> > > > +    particular location/memory bank (if system has more than one
> > > > +    memory bank) or buffers aligned to a particular memory boundary.
> > > > +
> > > > +    Development of embedded devices have seen a big rise recently
> > > > +    (especially in the V4L area) and many such drivers include their
> > > > +    own memory allocation code. Most of them use bootmem-based
> methods.
> > > > +    CMA framework is an attempt to unify contiguous memory
> allocation
> > > > +    mechanisms and provide a simple API for device drivers, while
> > > > +    staying as customisable and modular as possible.
> > > > +
> > > > +** Design
> > > > +
> > > > +    The main design goal for the CMA was to provide a customisable
> and
> > > > +    modular framework, which could be configured to suit the needs
> of
> > > > +    individual systems.  Configuration specifies a list of memory
> > > > +    regions, which then are assigned to devices.  Memory regions can
> > > > +    be shared among many device drivers or assigned exclusively to
> > > > +    one.  This has been achieved in the following ways:
> > >
> > > OK, I like the idea of regions, i.e. defining memory areas with
> specific
> > > properties or uses.
> > >
> > > But why should it be possible to define regions through kernel
> parameters?
> > > Regions are typically fixed for a particular platform and can be setup
> in
> > > the
> > > platform specific code. Actually, one region could be setup by default:
> > > DMA-able memory. That would be very handy in fact for many PCI-based TV
> > > capture drivers.
> >
> > IMHO this is a just desktop-point-of-view. In embedded world things are
> > a bit different. Most SoCs have a some kind of common system memory and
> > usually all build-in peripherals are able to DMA to any part of it (there
> is
> > no DMA specific hardware zone).
> 
> There are still desktops in the world, you know :-) It's nice if cma can
> also be used there. Having a region called 'dma' or something like that
> would be useful for drivers. And yes, for most embedded systems that would
> be the equivalent of common memory.

I really see no problem using CMA on desktops, although in most cases only
a few drivers will be able to make use of it.

> > > I think that the only thing that you want to set in the kernel params
> is
> > > the size of each region.
> >
> > Keeping it as a kernel parameter is very handy for development. But I
> agree
> > that we might make it dependent on some Kconfig entry. This way a
> platform
> > setup code would provide default region description just as an array of
> the
> > region structures and we will get rid of the parsing code in the release
> > versions.
> 
> I think that would help, yes.
> 
> > > The same with assigning regions to drivers: why would you want to do
> that?
> > > The driver should know which regions it can use (with possible
> fallbacks).
> >
> > I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> > desktop it is perfectly acceptable to have a separate memory region for
> each
> > device. In embedded world memory is a precious resource. Of course we can
> go
> > the 'separate memory region for each device' way, but we observed that at
> > least some memory can be recovered if we decide to share memory regions
> for
> > some of the devices.
> >
> > Assigning regions to the drivers is a way to describe how memory can be
> > shared. This is something that is independent from the actual drivers.
> > Device drivers cannot and mustn't have such knowledge.
> 
> You misunderstand me. I wasn't promoting separate memory regions for each
> device.  Quite the opposite: initially I would make regions shared by default.
> I actually do not see why device drivers can't have the knowledge to allocate
> their memory from particular regions. We are talking about embedded systems 
> where these types of memory allocations tend to be well defined. E.g. the
> video capture driver needs to allocate X video buffers at boot time. For
> most devices that will be the dma region, in the case of samsung devices
> that will be regions for particular memory banks.

Let me introduce one more example. As you may know, we have 3 video-processor
capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
can capture video data from a camera sensor and accelerate common video
processing tasks (like up/down scaling and color space conversion). Two FIMCs
and the MFC are required for things like HD video encoding or decoding with
online display/preview. This task requires huge video buffers that are
usually allocated and freed at the same time. The third FIMC can be used for
gfx acceleration (color space conversion and scaling are quite common tasks
in a GUI). This latter task usually introduces a lot of memory fragmentation,
as gfx surfaces are usually quite small (definitely smaller than HD frames
or an 8 MPix picture from a camera). It would be really wise to get that third
FIMC device to use a memory buffer that is shared with the 3D accelerator
(which has quite similar usage scenarios and suffers from similar memory
fragmentation).

> The only time reserving regions for specific devices might be necessary is
> if you can get memory fragmentation if multiple drivers use the same region.
> But if drivers just allocate the memory at boot, then shared regions work
> fine.
> >
> > > And it can know that provided regions are setup by the platform code
> and
> > > not created dynamically. This will simplify things enormously.
> > >
> > > > +    1. The core of the CMA does not handle allocation of memory and
> > > > +       management of free space.  Dedicated allocators are used for
> > > > +       that purpose.
> > > > +
> > > > +       This way, if the provided solution does not match demands
> > > > +       imposed on a given system, one can develop a new algorithm and
> > > > +       easily plug it into the CMA framework.
> > > > +
> > > > +       The presented solution includes an implementation of a best-fit
> > > > +       algorithm.
> > >
> > > Again, do we really need user-settable per-region allocators? Just
> provide
> > > one with the option to later choose others through the kernel Kconfig
> files.
> >
> > From our experience, yes. Different allocators can cope with different
> memory
> > usage scenarios better or worse. This results in higher or lower memory
> > fragmentation. System use cases are something that kernel or drivers are
> > definitely not aware, so only user space can tune this parameter to get
> the
> > best possible system behavior.
> 
> Can you describe some of those usage scenarios? The cases I am familiar
> with
> are the standard: 'need to allocate X buffers of Y MB memory each on boot'
> and these buffers are only freed when the module is unloaded. While you can
> get fragmentation in that case if you creatively load and unload modules,
> this is not the normal usage in a deployed system.

We don't want to allocate X buffers of Y MB memory each on boot. Instead we
want to just reserve XX MB memory and then dynamically allocate buffers from
it. This enables us to perform the following 2 tasks:
1. movie decoding in HD-quality (only one instance)
2. two instances of SD-quality movie decoding and SD-quality movie encoding
   (example: video conference)

We know that these two use cases are exclusive, so they can use the same
reserved memory.

> We already have several 'normal' memory allocators (slab, slub, slob, sleb,
> whatever. I've lost track there.) and I'm not very enthusiastic about the
> idea of adding another bunch of them. Especially not in a first release.
>
> Keep it simple. Just fix the core problem first, even if you are not yet
> able to squeeze the last byte of memory from your system. Later you can 
> always add features as needed and optimize it further.

We are trying to keep it as simple as possible :)
 
> > > We can always add more complex scenarios later, but for an initial
> version
> > > I'd keep it simple.
> > >
> > > > +
> > > > +    2. CMA allows a run-time configuration of the memory regions it
> > > > +       will use to allocate chunks of memory from.  The set of
> memory
> > > > +       regions is given on command line so it can be easily changed
> > > > +       without the need for recompiling the kernel.
> > > > +
> > > > +       Each region has its own size, alignment demand, a start
> > > > +       address (physical address where it should be placed) and an
> > > > +       allocator algorithm assigned to the region.
> > > > +
> > > > +       This means that there can be different algorithms running at
> > > > +       the same time, if different devices on the platform have
> > > > +       distinct memory usage characteristics and different algorithms
> > > > +       match those best.
> > >
> > > Seems overengineering to me. Just ensure that the code can be extended
> > > later to such hypothetical scenarios. They are hypothetical, right?
> >
> > Not really. Having the possibility to reconfigure memory configuration
> > without kernel recompilation is very handy when one is tuning the
> > configuration for the specific use case.
> >
> > > > +    3. When requesting memory, devices have to introduce themselves.
> > > > +       This way CMA knows who the memory is allocated for.  This
> > > > +       allows for the system architect to specify which memory
> regions
> > > > +       each device should use.
> > > > +
> > > > +       3a. Devices can also specify a "kind" of memory they want.
> > > > +           This makes it possible to configure the system in such
> > > > +           a way, that a single device may get memory from different
> > > > +           memory regions, depending on the "kind" of memory it
> > > > +           requested.  For example, a video codec driver might want
> to
> > > > +           allocate some shared buffers from the first memory bank
> and
> > > > +           the other from the second to get the highest possible
> > > > +           memory throughput.
> > >
> > > Not sure I understand this. Isn't this just two regions, one for each
> > > memory bank,
> > > and the driver requests some buffers from one region and some from the
> > > other?
> >
> > Right.
> >
> > > Not sure how a 'kind of memory' features in this.
> >
> > This 'kind' is a just cookie or a label used by the driver to distinguish
> > requests for both memory banks. This functionality is essential for our
> > hardware (just for hardware video codec we have 3 'kinds' of memory:
> memory
> > bank A, memory bank B and special region for the firmware).
> 
> Let me see if I understand: driver D needs memory from bank A, so it
> specifies
> e.g. label "a" when requesting memory. And the kernel parameters are then
> supposed
> to define a region for memory bank A (say region "bank-a" and tell the cma
> that
> requests from driver D for memory of kind "a" should go to region "bank-a".
> 
> Of course, the driver can also just request memory from the platform-
> defined
> region "bank-a" directly rather than having to rely on userspace provided
> parameters.

Right. Regions provided on the kernel command line are meant as an alternative
to the default configuration, for tuning/debugging purposes.

> > > > +    4. For greater flexibility and extensibility, the framework
> allows
> > > > +       device drivers to register private regions of reserved memory
> > > > +       which then may be used only by them.
> > > > +
> > > > +       As an effect, if a driver would not use the rest of the CMA
> > > > +       interface, it can still use CMA allocators and other
> > > > +       mechanisms.
> > >
> > > Why would you? Is there an actual driver that will need this?
> >
> > This feature has been added after posting v1 of this rfc/patch. Jonathan
> > Corbet suggested in
> <http://article.gmane.org/gmane.linux.kernel.mm/50689>
> > that viafb driver might register its own private memory and use cma just
> > as an allocator. IMHO this is a good idea, this way we might remove a
> bunch
> > of custom allocators from the drivers (yes, there are such all over the
> > kernel).
> 
> It goes back to whether we want per-device (or per-driver) regions or just
> have global regions.

IMHO both types of regions can coexist and we can make use of the advantages
of each. I see no point in limiting CMA to only one type of region.

> What I have seen in practice is that these drivers just
> need X amount of contiguous memory on boot. Having just a single region (as
> it will be for most systems) to carve the buffers from is just as efficient
> if not more than creating separate regions for each driver. Only if you
> start freeing and reallocating memory later on will you get into trouble.

> But if you do that, then you are trying to duplicate the behavior of the
> normal allocators in my opinion. I really don't think we want to go there.

Well, then why do some drivers like viafb or omapfb have custom memory
allocators? This case shows that there is still a need for another one.

> > > > +       4a. Early in boot process, device drivers can also request the
> > > > +           CMA framework to reserve a region of memory for them
> > > > +           which then will be used as a private region.
> > > > +
> > > > +           This way, drivers do not need to directly call bootmem,
> > > > +           memblock or similar early allocator but merely register
> an
> > > > +           early region and the framework will handle the rest
> > > > +           including choosing the right early allocator.
> > >
> > > The whole concept of private regions seems unnecessary to me.
> > >
> > > <big snip>
> > >
> > > It looks to me as if you tried to think of all possible hypothetical
> > > situations
> > > and write a framework for that. Of course, you may know more than I do,
> > > and some of these situations actually happen.
> >
> > Not exactly. We tried to design a solution that would cover all
> requirements
> > for OUR (quite specific) embedded hardware. However we didn't want to tie
> it
> > only to our platform. We just generalized most of our requirements so
> they can
> > be reused for other systems.
> >
> > > The basic design ideas are solid, I think. But you should get rid of
> all
> > > the fancy features and go back to basics. We can always add those
> features
> > > later should that become necessary. But removing features is much, much
> > harder.
> >
> > Well, please keep in mind that we cannot remove features that are
> essential
> > for our solution. We know that a simple framework have some advantages
> > (well, the most important one is the fact that it is easy to understand),
> > but making it too simple would render it useless from our point of view
> > (if it would not provide functionality required by our drivers and
> hardware).
> 
> Why not give a list of the requirements that your hardware has? The only
> requirement unique to your hardware that I am aware of is the need for
> buffers in specific memory banks (and apparently a special memory area for
> firmware).

Yes, this is a real hardware requirement that cannot be dropped. Other
requirements come from the system design (software).

> So besides a 'dma' region (which is probably sufficient for most systems)
> you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.

Actually this DMA region is just either 'banka' or 'bankb'. There are no
special requirements for the common DMA memory and no need for a separate
'dma' region in our case. We just want to share it with at least the camera
capture buffers.

> In the kernel parameters the user can specify the sizes for these regions
> that are reserved at boot time and drivers will use cma to get their memory
> from the relevant region using a simple allocator.
> 
> If there are other requirements in your hardware (or other hardware) that
> should be taken into account for an initial version of the cma, then I'm
> interested in hearing about it.

We will try to summarize our requirements again and repost them soon.
 
> One note: since the first version of the patch wasn't posted to linux-media
> I didn't follow the replies to that. If there are things in that thread
> that I should read, then just mail me some links.

I'm very sorry that the first version was posted only to linux-mm.
The discussion was mainly about how the memory configuration and
device mapping should be provided (the first version used only kernel
parameters for that). We should probably also CC a few more kernel subsystems
that might benefit from a common memory allocator, but we were already a bit
scared by the number of people and lists in this series. We really don't want
to get a 'spammer' label ;)

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center




^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-07-28  9:04             ` Marek Szyprowski
  0 siblings, 0 replies; 57+ messages in thread
From: Marek Szyprowski @ 2010-07-28  9:04 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Tuesday, July 27, 2010 6:27 PM Hans Verkuil wrote:

> On Tuesday 27 July 2010 09:41:40 Marek Szyprowski wrote:
> > On Monday, July 26, 2010 10:29 PM Hans Verkuil wrote:
> > > On Monday 26 July 2010 16:40:30 Michal Nazarewicz wrote:
> > > > The Contiguous Memory Allocator framework is a set of APIs for
> > > > allocating physically contiguous chunks of memory.
> > > >
> > > > Various chips require contiguous blocks of memory to operate.  Those
> > > > chips include devices such as cameras, hardware video decoders and
> > > > encoders, etc.
> > > >
> > > > The code is highly modular and customisable to suit the needs of
> > > > various users.  Set of regions reserved for CMA can be configured on
> > > > run-time and it is easy to add custom allocator algorithms if one
> > > > has such need.
> > > >
> > > > For more details see Documentation/contiguous-memory.txt.
> > > >
> > > > Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > > > Reviewed-by: Pawel Osciak <p.osciak@samsung.com>
> > > > ---
> > > >  Documentation/00-INDEX                             |    2 +
> > > >  .../ABI/testing/sysfs-kernel-mm-contiguous         |    9 +
> > > >  Documentation/contiguous-memory.txt                |  646 +++++++++++
> > > >  Documentation/kernel-parameters.txt                |    4 +
> > > >  include/linux/cma.h                                |  445 ++++++++
> > > >  mm/Kconfig                                         |   34 +
> > > >  mm/Makefile                                        |    3 +
> > > >  mm/cma-best-fit.c                                  |  407 +++++++
> > > >  mm/cma.c                                           | 1170 ++++++++++++++++++++
> > > >  9 files changed, 2720 insertions(+), 0 deletions(-)
> > > >  create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > >  create mode 100644 Documentation/contiguous-memory.txt
> > > >  create mode 100644 include/linux/cma.h
> > > >  create mode 100644 mm/cma-best-fit.c
> > > >  create mode 100644 mm/cma.c
> > > >
> > > > diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX
> > > > index 5405f7a..bb50209 100644
> > > > --- a/Documentation/00-INDEX
> > > > +++ b/Documentation/00-INDEX
> > > > @@ -94,6 +94,8 @@ connector/
> > > >  	- docs on the netlink based userspace<->kernel space communication mod.
> > > >  console/
> > > >  	- documentation on Linux console drivers.
> > > > +contiguous-memory.txt
> > > > +	- documentation on physically-contiguous memory allocation framework.
> > > >  cpu-freq/
> > > >  	- info on CPU frequency and voltage scaling.
> > > >  cpu-hotplug.txt
> > > > diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-contiguous b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > > new file mode 100644
> > > > index 0000000..05e2f6a
> > > > --- /dev/null
> > > > +++ b/Documentation/ABI/testing/sysfs-kernel-mm-contiguous
> > > > @@ -0,0 +1,9 @@
> > > > +What:		/sys/kernel/mm/contiguous/
> > > > +Date:		July 2008
> > > > +Contact:	Michal Nazarewicz <m.nazarewicz@samsung.com>
> > > > +Description:
> > > > +		/sys/kernel/mm/contiguous/ contains two files: asterisk and
> > > > +		map.  They are used to configure the Contiguous Memory
> > > > +		Allocator framework.
> > > > +
> > > > +		For details see Documentation/contiguous-memory.txt.
> > > > diff --git a/Documentation/contiguous-memory.txt b/Documentation/contiguous-memory.txt
> > > > new file mode 100644
> > > > index 0000000..6eb1295
> > > > --- /dev/null
> > > > +++ b/Documentation/contiguous-memory.txt
> > > > @@ -0,0 +1,646 @@
> > > > +                                                             -*- org -*-
> > > > +
> > > > +* Contiguous Memory Allocator
> > > > +
> > > > +   The Contiguous Memory Allocator (CMA) is a framework, which allows
> > > > +   setting up a machine-specific configuration for physically-contiguous
> > > > +   memory management. Memory for devices is then allocated according
> > > > +   to that configuration.
> > > > +
> > > > +   The main role of the framework is not to allocate memory, but to
> > > > +   parse and manage memory configurations, as well as to act as an
> > > > +   in-between between device drivers and pluggable allocators. It is
> > > > +   thus not tied to any memory allocation method or strategy.
> > > > +
> > > > +** Why is it needed?
> > > > +
> > > > +    Various devices on embedded systems have no scatter-gather and/or
> > > > +    IO map support and as such require contiguous blocks of memory to
> > > > +    operate.  They include devices such as cameras, hardware video
> > > > +    decoders and encoders, etc.
> > > > +
> > > > +    Such devices often require big memory buffers (a full HD frame is,
> > > > +    for instance, more than 2 megapixels large, i.e. more than 6 MB
> > > > +    of memory), which makes mechanisms such as kmalloc() ineffective.
> > > > +
> > > > +    Some embedded devices impose additional requirements on the
> > > > +    buffers, e.g. they can operate only on buffers allocated in
> > > > +    particular location/memory bank (if system has more than one
> > > > +    memory bank) or buffers aligned to a particular memory boundary.
> > > > +
> > > > +    Development of embedded devices has seen a big rise recently
> > > > +    (especially in the V4L area) and many such drivers include their
> > > > +    own memory allocation code. Most of them use bootmem-based methods.
> > > > +    The CMA framework is an attempt to unify contiguous memory allocation
> > > > +    mechanisms and provide a simple API for device drivers, while
> > > > +    staying as customisable and modular as possible.
> > > > +
> > > > +** Design
> > > > +
> > > > +    The main design goal for the CMA was to provide a customisable and
> > > > +    modular framework, which could be configured to suit the needs of
> > > > +    individual systems.  Configuration specifies a list of memory
> > > > +    regions, which then are assigned to devices.  Memory regions can
> > > > +    be shared among many device drivers or assigned exclusively to
> > > > +    one.  This has been achieved in the following ways:
> > >
> > > OK, I like the idea of regions, i.e. defining memory areas with
> specific
> > > properties or uses.
> > >
> > > But why should it be possible to define regions through kernel
> parameters?
> > > Regions are typically fixed for a particular platform and can be setup
> in
> > > the
> > > platform specific code. Actually, one region could be setup by default:
> > > DMA-able memory. That would be very handy in fact for many PCI-based TV
> > > capture drivers.
> >
> > IMHO this is just a desktop point of view. In the embedded world things are
> > a bit different. Most SoCs have some kind of common system memory and
> > usually all built-in peripherals are able to DMA to any part of it (there
> > is no DMA-specific hardware zone).
> 
> There are still desktops in the world, you know :-) It's nice if cma can
> also be used there. Having a region called 'dma' or something like that
> would be useful for drivers. And yes, for most embedded systems that would
> be the equivalent of common memory.

I really see no problem using CMA on desktops, although in most cases only
a few drivers will be able to make use of it.

> > > I think that the only thing that you want to set in the kernel params
> is
> > > the size of each region.
> >
> > Keeping it as a kernel parameter is very handy for development. But I
> agree
> > that we might make it dependent on some Kconfig entry. This way a
> platform
> > setup code would provide default region description just as an array of
> the
> > region structures and we will get rid of the parsing code in the release
> > versions.
> 
> I think that would help, yes.
> 
> > > The same with assigning regions to drivers: why would you want to do
> that?
> > > The driver should know which regions it can use (with possible
> fallbacks).
> >
> > I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> > desktop it is perfectly acceptable to have a separate memory region for
> each
> > device. In embedded world memory is a precious resource. Of course we can
> go
> > the 'separate memory region for each device' way, but we observed that at
> > least some memory can be recovered if we decide to share memory regions
> for
> > some of the devices.
> >
> > Assigning regions to the drivers is a way to describe how memory can be
> > shared. This is something that is independent from the actual drivers.
> > Device drivers cannot and mustn't have such knowledge.
> 
> You misunderstand me. I wasn't promoting separate memory regions for each
> device.  Quite the opposite: initially I would make regions shared by default.
> I actually do not see why device drivers can't have the knowledge to allocate
> their memory from particular regions. We are talking about embedded systems 
> where these types of memory allocations tend to be well defined. E.g. the
> video capture driver needs to allocate X video buffers at boot time. For
> most devices that will be the dma region, in the case of samsung devices
> that will be regions for particular memory banks.

Let me introduce one more example. As you may know, we have 3 video-processor
capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
can capture video data from a camera sensor and accelerate common video
processing tasks (like up/down scaling and color space conversion). Two FIMCs
and the MFC are required for things like HD video encoding or decoding with
online display/preview. This task requires huge video buffers that are
usually allocated and freed at the same time. The third FIMC can be used for
gfx acceleration (color space conversion and scaling are quite common tasks
in a GUI). This latter task usually introduces a lot of memory fragmentation,
as gfx surfaces are usually quite small (definitely smaller than HD frames
or an 8 MPix picture from a camera). It would be really wise to get that third
FIMC device to use a memory buffer that is shared with the 3D accelerator
(which has quite similar usage scenarios and suffers from similar memory
fragmentation).

> The only time reserving regions for specific devices might be necessary is
> if you can get memory fragmentation if multiple drivers use the same region.
> But if drivers just allocate the memory at boot, then shared regions work
> fine.
> >
> > > And it can know that provided regions are setup by the platform code
> and
> > > not created dynamically. This will simplify things enormously.
> > >
> > > > +    1. The core of the CMA does not handle allocation of memory and
> > > > +       management of free space.  Dedicated allocators are used for
> > > > +       that purpose.
> > > > +
> > > > +       This way, if the provided solution does not match demands
> > > > +       imposed on a given system, one can develop a new algorithm and
> > > > +       easily plug it into the CMA framework.
> > > > +
> > > > +       The presented solution includes an implementation of a best-fit
> > > > +       algorithm.
> > >
> > > Again, do we really need user-settable per-region allocators? Just
> provide
> > > one with the option to later choose others through the kernel Kconfig
> files.
> >
> > From our experience, yes. Different allocators can cope with different
> memory
> > usage scenarios better or worse. This results in higher or lower memory
> > fragmentation. System use cases are something that kernel or drivers are
> > definitely not aware, so only user space can tune this parameter to get
> the
> > best possible system behavior.
> 
> Can you describe some of those usage scenarios? The cases I am familiar
> with
> are the standard: 'need to allocate X buffers of Y MB memory each on boot'
> and these buffers are only freed when the module is unloaded. While you can
> get fragmentation in that case if you creatively load and unload modules,
> this is not the normal usage in a deployed system.

We don't want to allocate X buffers of Y MB memory each on boot. Instead we
want to just reserve XX MB memory and then dynamically allocate buffers from
it. This enables us to perform the following 2 tasks:
1. movie decoding in HD-quality (only one instance)
2. two instances of SD-quality movie decoding and SD-quality movie encoding
   (example: video conference)

We know that these two use cases are exclusive, so they can use the same
reserved memory.

> We already have several 'normal' memory allocators (slab, slub, slob, sleb,
> whatever. I've lost track there.) and I'm not very enthusiastic about the
> idea of adding another bunch of them. Especially not in a first release.
>
> Keep it simple. Just fix the core problem first, even if you are not yet
> able to squeeze the last byte of memory from your system. Later you can 
> always add features as needed and optimize it further.

We are trying to keep it as simple as possible :)
 
> > > We can always add more complex scenarios later, but for an initial
> version
> > > I'd keep it simple.
> > >
> > > > +
> > > > +    2. CMA allows a run-time configuration of the memory regions it
> > > > +       will use to allocate chunks of memory from.  The set of
> memory
> > > > +       regions is given on command line so it can be easily changed
> > > > +       without the need for recompiling the kernel.
> > > > +
> > > > +       Each region has its own size, alignment demand, a start
> > > > +       address (physical address where it should be placed) and an
> > > > +       allocator algorithm assigned to the region.
> > > > +
> > > > +       This means that there can be different algorithms running at
> > > > +       the same time, if different devices on the platform have
> > > > +       distinct memory usage characteristics and different algorithms
> > > > +       match those best.
> > >
> > > Seems overengineering to me. Just ensure that the code can be extended
> > > later to such hypothetical scenarios. They are hypothetical, right?
> >
> > Not really. Having the possibility to reconfigure memory configuration
> > without kernel recompilation is very handy when one is tuning the
> > configuration for the specific use case.
> >
> > > > +    3. When requesting memory, devices have to introduce themselves.
> > > > +       This way CMA knows who the memory is allocated for.  This
> > > > +       allows for the system architect to specify which memory
> regions
> > > > +       each device should use.
> > > > +
> > > > +       3a. Devices can also specify a "kind" of memory they want.
> > > > +           This makes it possible to configure the system in such
> > > > +           a way, that a single device may get memory from different
> > > > +           memory regions, depending on the "kind" of memory it
> > > > +           requested.  For example, a video codec driver might want
> to
> > > > +           allocate some shared buffers from the first memory bank
> and
> > > > +           the other from the second to get the highest possible
> > > > +           memory throughput.
> > >
> > > Not sure I understand this. Isn't this just two regions, one for each
> > > memory bank,
> > > and the driver requests some buffers from one region and some from the
> > > other?
> >
> > Right.
> >
> > > Not sure how a 'kind of memory' features in this.
> >
> > This 'kind' is a just cookie or a label used by the driver to distinguish
> > requests for both memory banks. This functionality is essential for our
> > hardware (just for hardware video codec we have 3 'kinds' of memory:
> memory
> > bank A, memory bank B and special region for the firmware).
> 
> Let me see if I understand: driver D needs memory from bank A, so it
> specifies
> e.g. label "a" when requesting memory. And the kernel parameters are then
> supposed
> to define a region for memory bank A (say region "bank-a" and tell the cma
> that
> requests from driver D for memory of kind "a" should go to region "bank-a".
> 
> Of course, the driver can also just request memory from the platform-
> defined
> region "bank-a" directly rather than having to rely on userspace provided
> parameters.

Right. Regions provided on the kernel command line are meant as an alternative
to the default configuration, for tuning/debugging purposes.

> > > > +    4. For greater flexibility and extensibility, the framework
> allows
> > > > +       device drivers to register private regions of reserved memory
> > > > +       which then may be used only by them.
> > > > +
> > > > +       As an effect, if a driver would not use the rest of the CMA
> > > > +       interface, it can still use CMA allocators and other
> > > > +       mechanisms.
> > >
> > > Why would you? Is there an actual driver that will need this?
> >
> > This feature has been added after posting v1 of this rfc/patch. Jonathan
> > Corbet suggested in
> <http://article.gmane.org/gmane.linux.kernel.mm/50689>
> > that viafb driver might register its own private memory and use cma just
> > as an allocator. IMHO this is a good idea, this way we might remove a
> bunch
> > of custom allocators from the drivers (yes, there are such all over the
> > kernel).
> 
> It goes back to whether we want per-device (or per-driver) regions or just
> have global regions.

IMHO both types of regions can coexist and we can make use of their
advantages. I see no point in limiting CMA only to one type of the regions.

> What I have seen in practice is that these drivers just
> need X amount of contiguous memory on boot. Having just a single region (as
> it will be for most systems) to carve the buffers from is just as efficient
> if not more than creating separate regions for each driver. Only if you
> start freeing and reallocating memory later on will you get into trouble.

> But if you do that, then you are trying to duplicate the behavior of the
> normal allocators in my opinion. I really don't think we want to go there.

Well, then why do some drivers like viafb or omapfb have custom memory
allocators? This case shows that there is still a need for another one.

> > > > +       4a. Early in boot process, device drivers can also request the
> > > > +           CMA framework to reserve a region of memory for them
> > > > +           which then will be used as a private region.
> > > > +
> > > > +           This way, drivers do not need to directly call bootmem,
> > > > +           memblock or similar early allocator but merely register an
> > > > +           early region and the framework will handle the rest
> > > > +           including choosing the right early allocator.
> > >
> > > The whole concept of private regions seems unnecessary to me.
> > >
> > > <big snip>
> > >
> > > It looks to me as if you tried to think of all possible hypothetical
> > > situations
> > > and write a framework for that. Of course, you may know more than I do,
> > > and some of these situations actually happen.
> >
> > Not exactly. We tried to design a solution that would cover all requirements
> > for OUR (quite specific) embedded hardware. However we didn't want to tie it
> > only to our platform. We just generalized most of our requirements so they
> > can be reused for other systems.
> >
> > > The basic design ideas are solid, I think. But you should get rid of all
> > > the fancy features and go back to basics. We can always add those features
> > > later should that become necessary. But removing features is much, much
> > > harder.
> >
> > Well, please keep in mind that we cannot remove features that are essential
> > for our solution. We know that a simple framework has some advantages
> > (well, the most important one is the fact that it is easy to understand),
> > but making it too simple would render it useless from our point of view
> > (if it would not provide functionality required by our drivers and hardware).
> 
> Why not give a list of the requirements that your hardware has? The only
> requirement unique to your hardware that I am aware of is the need for
> buffers in specific memory banks (and apparently a special memory area for
> firmware).

Yes, this is a real hardware requirement that cannot be dropped. Other
requirements come from the system design (software).

> So besides a 'dma' region (which is probably sufficient for most systems)
> you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.

Actually this DMA region is either just a 'banka' or 'bankb'. There are no
special requirements for the common dma memory and no need for a separate
'dma' region in our case. We just want to make it shared with at least camera
capture buffers.

> In the kernel parameters the user can specify the sizes for these regions
> that are reserved at boot time and drivers will use cma to get their memory
> from the relevant region using a simple allocator.
> 
> If there are other requirements in your hardware (or other hardware) that
> should be taken into account for an initial version of the cma, then I'm
> interested in hearing about it.

We will try to summarize our requirements again and repost them soon.
 
> One note: since the first version of the patch wasn't posted to linux-media
> I didn't follow the replies to that. If there are things in that thread
> that I should read, then just mail me some links.

I'm very sorry that the first version has been posted only to linux-mm.
The discussion was mainly about how the memory configuration and
device mapping should be provided (first version used only kernel parameters
for that). We should probably also CC a few more kernel subsystems that might
benefit from a common memory allocator, but we already were a bit scared by
the number of people and lists in this series. We really don't want to get
a 'spammer' label ;)

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center


* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-07-28  9:04             ` Marek Szyprowski
  (?)
@ 2010-08-01 13:26               ` Hans Verkuil
  -1 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-08-01 13:26 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Michal Nazarewicz, linux-mm, linux-kernel, linux-media,
	linux-arm-kernel, 'Hiremath Vaibhav',
	Pawel Osciak, 'Mark Brown', 'Daniel Walker',
	'Jonathan Corbet', 'FUJITA Tomonori',
	'Zach Pfeffer', 'Kyungmin Park'

Hi Marek,

On Wednesday 28 July 2010 11:04:56 Marek Szyprowski wrote:

<snip>

> > There are still desktops in the world, you know :-) It's nice if cma can
> > also be used there. Having a region called 'dma' or something like that
> > would be useful for drivers. And yes, for most embedded systems that would
> > be the equivalent of common memory.
> 
> I really see no problem using CMA on desktops, although in most cases only
> a few drivers will be able to make use of it.

As a past maintainer of ivtv I can testify that I know at least one driver that
could definitely benefit from CMA.

> 
> > > > I think that the only thing that you want to set in the kernel params is
> > > > the size of each region.
> > >
> > > Keeping it as a kernel parameter is very handy for development. But I agree
> > > that we might make it dependent on some Kconfig entry. This way a platform
> > > setup code would provide default region description just as an array of the
> > > region structures and we will get rid of the parsing code in the release
> > > versions.
> > 
> > I think that would help, yes.
> > 
> > > > The same with assigning regions to drivers: why would you want to do that?
> > > > The driver should know which regions it can use (with possible fallbacks).
> > >
> > > I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> > > desktop it is perfectly acceptable to have a separate memory region for each
> > > device. In embedded world memory is a precious resource. Of course we can go
> > > the 'separate memory region for each device' way, but we observed that at
> > > least some memory can be recovered if we decide to share memory regions for
> > > some of the devices.
> > >
> > > Assigning regions to the drivers is a way to describe how memory can be
> > > shared. This is something that is independent from the actual drivers.
> > > Device drivers cannot and mustn't have such knowledge.
> > 
> > You misunderstand me. I wasn't promoting separate memory regions for each
> > device.  Quite the opposite: initially I would make regions shared by default.
> > I actually do not see why device drivers can't have the knowledge to allocate
> > their memory from particular regions. We are talking about embedded systems 
> > where these types of memory allocations tend to be well defined. E.g. the
> > video capture driver needs to allocate X video buffers at boot time. For
> > most devices that will be the dma region, in the case of samsung devices
> > that will be regions for particular memory banks.
> 
> Let me introduce one more example. As you may know we have 3 video-processor
> capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
> can capture video data from camera sensor and accelerate common video
> processing tasks (like up/down scaling and color space conversion). Two FIMC
> and MFC are required for things like HD video encoding or decoding with
> online display/preview. This task requires huge video buffers that are
> usually allocated and freed at the same time. The third FIMC can be used for
> gfx acceleration (color space conversion and scaling are quite common tasks
> in GUI). This latter task usually introduces a lot of memory fragmentation,
> as gfx surfaces are usually quite small (definitely smaller than HD frames
> or 8MPix picture from camera). It would be really wise to get that third
> FIMC device to use memory buffer that will be shared with 3D accelerator
> (which has quite similar usage scenarios and suffers from similar memory
> fragmentation).

OK, I understand. And I assume both gfx and 3D acceleration need to use a
specific region? If they can use any type of memory, then this might be more
appropriate for kmalloc and friends.

> > The only time reserving regions for specific devices might be necessary is
> > if you can get memory fragmentation if multiple drivers use the same region.
> > But if drivers just allocate the memory at boot, then shared regions work
> > fine.
> > >
> > > > And it can know that provided regions are setup by the platform code and
> > > > not created dynamically. This will simplify things enormously.
> > > >
> > > > > +    1. The core of the CMA does not handle allocation of memory and
> > > > > +       management of free space.  Dedicated allocators are used for
> > > > > +       that purpose.
> > > > > +
> > > > > +       This way, if the provided solution does not match demands
> > > > > +       imposed on a given system, one can develop a new algorithm and
> > > > > +       easily plug it into the CMA framework.
> > > > > +
> > > > > +       The presented solution includes an implementation of a best-fit
> > > > > +       algorithm.
> > > >
> > > > Again, do we really need user-settable per-region allocators? Just provide
> > > > one with the option to later choose others through the kernel Kconfig files.
> > >
> > > From our experience, yes. Different allocators can cope with different memory
> > > usage scenarios better or worse. This results in higher or lower memory
> > > fragmentation. System use cases are something that kernel or drivers are
> > > definitely not aware of, so only user space can tune this parameter to get the
> > > best possible system behavior.
> > 
> > Can you describe some of those usage scenarios? The cases I am familiar with
> > are the standard: 'need to allocate X buffers of Y MB memory each on boot'
> > and these buffers are only freed when the module is unloaded. While you can
> > get fragmentation in that case if you creatively load and unload modules,
> > this is not the normal usage in a deployed system.
> 
> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
> want to just reserve XX MB memory and then dynamically allocate buffers from
> it. This enables us to perform the following 2 tasks:
> 1. movie decoding in HD-quality (only one instance)
> 2. two instances of SD-quality movie decoding and SD-quality movie encoding
>    (example: video conference)
> 
> We know that these two use cases are exclusive, so they can use the same
> reserved memory.

When I said 'allocating X buffers of Y MB memory' I meant that you need to
allocate a known amount of memory (X * Y MB in this case). So the boot args
say e.g. dma=40MB and the driver just allocates X buffers from that region.

Switching to SD quality requires releasing those buffers and instead allocating
a number of smaller buffers from the same region.

For these use-cases the allocator can be very simple and probably covers most
use-cases.
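
To make "very simple" concrete, here is a minimal userspace sketch of a
first-fit allocator working on offsets inside one reserved region. It is not
taken from the patchset (which ships a best-fit allocator); the names
example_region, example_alloc() and example_free() and the fixed chunk limit
are assumptions chosen only for illustration.

```c
#include <stddef.h>

#define EXAMPLE_MAX_CHUNKS 16

struct example_chunk {
	size_t start;	/* offset of the chunk inside the region */
	size_t size;	/* length of the chunk in bytes */
};

struct example_region {
	size_t size;	/* total bytes reserved at boot */
	size_t count;	/* number of chunks currently allocated */
	struct example_chunk chunks[EXAMPLE_MAX_CHUNKS]; /* sorted by start */
};

/* First fit: return the offset of the new chunk, or (size_t)-1 on failure. */
size_t example_alloc(struct example_region *r, size_t size)
{
	size_t i, j, pos = 0;

	if (size == 0 || r->count == EXAMPLE_MAX_CHUNKS)
		return (size_t)-1;

	/* Walk the gaps in front of each allocated chunk, in address order. */
	for (i = 0; i < r->count; i++) {
		if (r->chunks[i].start - pos >= size)
			break;
		pos = r->chunks[i].start + r->chunks[i].size;
	}
	/* No gap was large enough: try the free tail of the region. */
	if (i == r->count && r->size - pos < size)
		return (size_t)-1;

	/* Insert at index i so the array stays sorted by start offset. */
	for (j = r->count; j > i; j--)
		r->chunks[j] = r->chunks[j - 1];
	r->chunks[i].start = pos;
	r->chunks[i].size = size;
	r->count++;
	return pos;
}

/* Release the chunk starting at 'start'; 0 on success, -1 if not found. */
int example_free(struct example_region *r, size_t start)
{
	size_t i;

	for (i = 0; i < r->count; i++) {
		if (r->chunks[i].start == start) {
			for (; i + 1 < r->count; i++)
				r->chunks[i] = r->chunks[i + 1];
			r->count--;
			return 0;
		}
	}
	return -1;
}
```

In the HD/SD scenario the driver would free its large buffers and then carve
smaller ones out of the same region; as long as buffers of one use case are
released together, even this naive scheme leaves no lasting fragmentation.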

Anyway, I'm no expert on memory allocators and people on the linux-mm list are
no doubt much more qualified to discuss this. My main concern is that of
trying to add too much for a first release. It is simply easier to start simple
and extend as needed. That makes it easier to be accepted in the mainline.

> 
> > We already have several 'normal' memory allocators (slab, slub, slob, sleb,
> > whatever. I've lost track there.) and I'm not very enthusiastic about the
> > idea of adding another bunch of them. Especially not in a first release.
> >
> > Keep it simple. Just fix the core problem first, even if you are not yet
> > able to squeeze the last byte of memory from your system. Later you can 
> > always add features as needed and optimize it further.
> 
> We are trying to keep it as simple as possible :)
>  
> > > > We can always add more complex scenarios later, but for an initial version
> > > > I'd keep it simple.
> > > >
> > > > > +
> > > > > +    2. CMA allows a run-time configuration of the memory regions it
> > > > > +       will use to allocate chunks of memory from.  The set of memory
> > > > > +       regions is given on command line so it can be easily changed
> > > > > +       without the need for recompiling the kernel.
> > > > > +
> > > > > +       Each region has its own size, alignment demand, a start
> > > > > +       address (physical address where it should be placed) and an
> > > > > +       allocator algorithm assigned to the region.
> > > > > +
> > > > > +       This means that there can be different algorithms running at
> > > > > +       the same time, if different devices on the platform have
> > > > > +       distinct memory usage characteristics and different algorithms
> > > > > +       match those the best way.
> > > >
> > > > Seems overengineering to me. Just ensure that the code can be extended
> > > > later to such hypothetical scenarios. They are hypothetical, right?
> > >
> > > Not really. Having the possibility to reconfigure memory configuration
> > > without kernel recompilation is very handy when one is tuning the
> > > configuration for the specific use case.
> > >
> > > > > +    3. When requesting memory, devices have to introduce themselves.
> > > > > +       This way CMA knows who the memory is allocated for.  This
> > > > > +       allows for the system architect to specify which memory regions
> > > > > +       each device should use.
> > > > > +
> > > > > +       3a. Devices can also specify a "kind" of memory they want.
> > > > > +           This makes it possible to configure the system in such
> > > > > +           a way, that a single device may get memory from different
> > > > > +           memory regions, depending on the "kind" of memory it
> > > > > +           requested.  For example, a video codec driver might want to
> > > > > +           allocate some shared buffers from the first memory bank and
> > > > > +           the other from the second to get the highest possible
> > > > > +           memory throughput.
> > > >
> > > > Not sure I understand this. Isn't this just two regions, one for each
> > > > memory bank,
> > > > and the driver requests some buffers from one region and some from the
> > > > other?
> > >
> > > Right.
> > >
> > > > Not sure how a 'kind of memory' features in this.
> > >
> > > This 'kind' is just a cookie or a label used by the driver to distinguish
> > > requests for both memory banks. This functionality is essential for our
> > > hardware (just for the hardware video codec we have 3 'kinds' of memory:
> > > memory bank A, memory bank B and a special region for the firmware).
> > 
> > Let me see if I understand: driver D needs memory from bank A, so it
> > specifies e.g. label "a" when requesting memory. And the kernel parameters
> > are then supposed to define a region for memory bank A (say region "bank-a")
> > and tell the cma that requests from driver D for memory of kind "a" should
> > go to region "bank-a".
> > 
> > Of course, the driver can also just request memory from the platform-defined
> > region "bank-a" directly rather than having to rely on userspace provided
> > parameters.
> 
> Right. Regions provided by kernel command line are considered to be alternative
> to the default configuration for tuning/debugging purposes.
> 
> > > > > +    4. For greater flexibility and extensibility, the framework allows
> > > > > +       device drivers to register private regions of reserved memory
> > > > > +       which then may be used only by them.
> > > > > +
> > > > > +       As an effect, if a driver would not use the rest of the CMA
> > > > > +       interface, it can still use CMA allocators and other
> > > > > +       mechanisms.
> > > >
> > > > Why would you? Is there an actual driver that will need this?
> > >
> > > This feature has been added after posting v1 of this rfc/patch. Jonathan
> > > Corbet suggested in <http://article.gmane.org/gmane.linux.kernel.mm/50689>
> > > that viafb driver might register its own private memory and use cma just
> > > as an allocator. IMHO this is a good idea, this way we might remove a bunch
> > > of custom allocators from the drivers (yes, there are such all over the
> > > kernel).
> > 
> > It goes back to whether we want per-device (or per-driver) regions or just
> > have global regions.
> 
> IMHO both types of regions can coexist and we can make use of their
> advantages. I see no point in limiting CMA only to one type of the regions.
> 
> > What I have seen in practice is that these drivers just
> > need X amount of contiguous memory on boot. Having just a single region (as
> > it will be for most systems) to carve the buffers from is just as efficient
> > if not more than creating separate regions for each driver. Only if you
> > start freeing and reallocating memory later on will you get into trouble.
> 
> > But if you do that, then you are trying to duplicate the behavior of the
> > normal allocators in my opinion. I really don't think we want to go there.
> 
> Well, then why do some drivers like viafb or omapfb have custom memory
> allocators? This case shows that there is still a need for another one.

I have no idea why they have allocators. Someone would have to analyze that
code. Just because they use allocators doesn't necessarily mean that they
really need them :-)

> 
> > > > > +       4a. Early in boot process, device drivers can also request the
> > > > > +           CMA framework to reserve a region of memory for them
> > > > > +           which then will be used as a private region.
> > > > > +
> > > > > +           This way, drivers do not need to directly call bootmem,
> > > > > +           memblock or similar early allocator but merely register an
> > > > > +           early region and the framework will handle the rest
> > > > > +           including choosing the right early allocator.
> > > >
> > > > The whole concept of private regions seems unnecessary to me.
> > > >
> > > > <big snip>
> > > >
> > > > It looks to me as if you tried to think of all possible hypothetical
> > > > situations
> > > > and write a framework for that. Of course, you may know more than I do,
> > > > and some of these situations actually happen.
> > >
> > > Not exactly. We tried to design a solution that would cover all requirements
> > > for OUR (quite specific) embedded hardware. However we didn't want to tie it
> > > only to our platform. We just generalized most of our requirements so they
> > > can be reused for other systems.
> > >
> > > > The basic design ideas are solid, I think. But you should get rid of all
> > > > the fancy features and go back to basics. We can always add those features
> > > > later should that become necessary. But removing features is much, much
> > > > harder.
> > >
> > > Well, please keep in mind that we cannot remove features that are essential
> > > for our solution. We know that a simple framework has some advantages
> > > (well, the most important one is the fact that it is easy to understand),
> > > but making it too simple would render it useless from our point of view
> > > (if it would not provide functionality required by our drivers and hardware).
> > 
> > Why not give a list of the requirements that your hardware has? The only
> > requirement unique to your hardware that I am aware of is the need for
> > buffers in specific memory banks (and apparently a special memory area for
> > firmware).
> 
> Yes, this is a real hardware requirement that cannot be dropped. Other
> requirements come from the system design (software).
> 
> > So besides a 'dma' region (which is probably sufficient for most systems)
> > you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.
> 
> Actually this DMA region is either just a 'banka' or 'bankb'. There are no
> special requirements for the common dma memory and no need for a separate
> 'dma' region in our case. We just want to make it shared with at least camera
> capture buffers.
> 
> > In the kernel parameters the user can specify the sizes for these regions
> > that are reserved at boot time and drivers will use cma to get their memory
> > from the relevant region using a simple allocator.
> > 
> > If there are other requirements in your hardware (or other hardware) that
> > should be taken into account for an initial version of the cma, then I'm
> > interested in hearing about it.
> 
> We will try to summarize our requirements again and repost them soon.
>  
> > One note: since the first version of the patch wasn't posted to linux-media
> > I didn't follow the replies to that. If there are things in that thread
> > that I should read, then just mail me some links.
> 
> I'm very sorry that the first version has been posted only to linux-mm.
> The discussion was mainly about how the memory configuration and
> device mapping should be provided (first version used only kernel parameters
> for that). We should probably also CC a few more kernel subsystems that might
> benefit from a common memory allocator, but we already were a bit scared by
> the number of people and lists in this series. We really don't want to get
> a 'spammer' label ;)

You probably got the main interested parties now.

Regarding regions and shared and per-driver buffers: I've been thinking about this
a bit more and I have a proposal of my own.

There are two different aspects to this. First there is the hardware aspect: if
the hardware needs memory from specific memory banks or with specific requirements
(e.g. DMAable), then those regions should be set up in the platform code. There you
know the memory sizes/alignments/etc. since that is hw dependent. The other reason
is that drivers need to be able to tell CMA that they need to allocate from such
regions. You can't have a driver refer to a region that is specified through
kernel parameters; that would create a very ugly dependency.

The other aspect is how to set up buffers. A global buffer is simply set up by
assigning a size to the region: "banka=20MB". Unless specified otherwise any
driver that needs memory from banka will use that global banka buffer.

Alternatively, you can set aside memory from a region specifically for drivers:
banka/foo=30MB. This allocates 30 MB from region banka specifically for driver foo.

You can also share that with another driver:

banka/foo,bar=30MB

Now this 30 MB buffer is shared between drivers foo and bar.

The nice thing about this is that the driver will still only refer to region
banka as set up by the platform code.
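
To make the proposed syntax concrete, here is a small userspace sketch of a
parser for a single "region[/driver[,driver...]]=<N>MB" entry. It only covers
the three forms shown above; the struct, the limits and the function name
example_parse_spec() are assumptions for illustration and not part of CMA.

```c
#include <stdlib.h>
#include <string.h>

#define EXAMPLE_NAME_LEN  16
#define EXAMPLE_MAX_USERS 4

struct example_buffer_spec {
	char region[EXAMPLE_NAME_LEN];			 /* e.g. "banka" */
	char users[EXAMPLE_MAX_USERS][EXAMPLE_NAME_LEN]; /* drivers sharing it */
	int nr_users;		/* 0 means the global buffer of the region */
	unsigned long size_mb;	/* buffer size in megabytes */
};

/* Copy 'len' characters of 's' into 'dst'; fail if empty or too long. */
static int example_copy_name(char *dst, const char *s, size_t len)
{
	if (len == 0 || len >= EXAMPLE_NAME_LEN)
		return -1;
	memcpy(dst, s, len);
	dst[len] = '\0';
	return 0;
}

/* Parse one entry such as "banka/foo,bar=30MB"; 0 on success, -1 on error. */
int example_parse_spec(const char *str, struct example_buffer_spec *out)
{
	const char *eq = strchr(str, '=');
	const char *slash, *p;
	char *end;

	if (!eq)
		return -1;
	memset(out, 0, sizeof(*out));

	/* The region name ends at '/' if one precedes '=', else at '='. */
	slash = memchr(str, '/', (size_t)(eq - str));
	if (example_copy_name(out->region, str,
			      (size_t)((slash ? slash : eq) - str)))
		return -1;

	/* Optional comma-separated driver list between '/' and '='. */
	if (slash) {
		p = slash + 1;
		for (;;) {
			const char *comma = memchr(p, ',', (size_t)(eq - p));
			const char *stop = comma ? comma : eq;

			if (out->nr_users == EXAMPLE_MAX_USERS ||
			    example_copy_name(out->users[out->nr_users], p,
					      (size_t)(stop - p)))
				return -1;
			out->nr_users++;
			if (!comma)
				break;
			p = comma + 1;
		}
	}

	/* Size: a positive decimal number followed by the suffix "MB". */
	if (eq[1] < '0' || eq[1] > '9')
		return -1;
	out->size_mb = strtoul(eq + 1, &end, 10);
	if (out->size_mb == 0 || strcmp(end, "MB") != 0)
		return -1;
	return 0;
}
```

The driver list never leaks into the drivers themselves: a driver still asks
for "banka", and the parsed entry only decides which buffer inside that region
serves it.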

And in the more general case you can have two standard regions: dma and common.
So drivers can rely on the presence of a dma region when allocating buffers.

What would make this even better is that CMA has the option to try and allocate
additional memory on the fly if its memory pool becomes empty. E.g. if the dma
pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).

This allows you to set up the dma and common regions with size 0. So allocating
from the dma region would effectively be the same as doing a kmalloc. Unless
the user sets up a dma area in the kernel parameters.

Obviously this is probably impossible if you need memory from specific memory
banks, so this is something that is not available for every region.
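
The decision described above can be sketched as plain accounting. The code
below is a userspace illustration under my own assumptions (the example_pool
struct, the may_fall_back flag and the function name are made up); in the
kernel the EXAMPLE_FALLBACK case would correspond to the kmalloc call
mentioned above.

```c
#include <stddef.h>

struct example_pool {
	size_t size;		/* bytes reserved at boot; may be 0 */
	size_t used;		/* bytes already handed out from the reservation */
	int may_fall_back;	/* 0 for regions tied to a specific memory bank */
};

enum example_source {
	EXAMPLE_NONE,		/* request cannot be satisfied */
	EXAMPLE_RESERVED,	/* served from the boot-time reservation */
	EXAMPLE_FALLBACK,	/* served by the general-purpose allocator */
};

/* Decide where a request of 'size' bytes is served from and account for it. */
enum example_source example_pool_pick(struct example_pool *p, size_t size)
{
	if (size <= p->size - p->used) {
		p->used += size;
		return EXAMPLE_RESERVED;
	}
	return p->may_fall_back ? EXAMPLE_FALLBACK : EXAMPLE_NONE;
}
```

A 'dma' pool of size 0 with may_fall_back set therefore sends every request to
the fallback path, while a bank-specific pool simply fails once it is full.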

The nice thing about this is that it is very flexible for end users. For example,
most users of the ivtv driver wouldn't have to do anything since most of the time
it is able to assign the necessary buffers. But some users have multiple ivtv-based
capture boards in their PC, and then it can become harder to have ivtv obtain the
needed buffers. In that case they can preallocate the buffers by setting
dma/ivtv=500MB or something like that.

That would be a really nice feature...

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-08-01 13:26               ` Hans Verkuil
  0 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-08-01 13:26 UTC (permalink / raw)
  To: Marek Szyprowski
  Cc: Michal Nazarewicz, linux-mm, linux-kernel, linux-media,
	linux-arm-kernel, 'Hiremath Vaibhav',
	Pawel Osciak, 'Mark Brown', 'Daniel Walker',
	'Jonathan Corbet', 'FUJITA Tomonori',
	'Zach Pfeffer', 'Kyungmin Park'

Hi Marek,

On Wednesday 28 July 2010 11:04:56 Marek Szyprowski wrote:

<snip>

> > There are still desktops in the world, you know :-) It's nice if cma can
> > also be used there. Having a region called 'dma' or something like that
> > would be useful for drivers. And yes, for most embedded systems that would
> > be the equivalent of common memory.
> 
> I really see no problem using CMA on desktops, although in most cases only
> a few drivers will be able to make use of it.

As past maintainer of ivtv I can testify that I know at least one driver that
could definitely benefit from CMA.

> 
> > > > I think that the only thing that you want to set in the kernel params
> > is
> > > > the size of each region.
> > >
> > > Keeping it as a kernel parameter is very handy for development. But I
> > agree
> > > that we might make it dependent on some Kconfig entry. This way a
> > platform
> > > setup code would provide default region description just as an array of
> > the
> > > region structures and we will get rid of the parsing code in the release
> > > versions.
> > 
> > I think that would help, yes.
> > 
> > > > The same with assigning regions to drivers: why would you want to do
> > that?
> > > > The driver should know which regions it can use (with possible
> > fallbacks).
> > >
> > > I'm sorry, but this is again a little 'desktop-centric point-of-view'. On
> > > desktop it is perfectly acceptable to have a separate memory region for
> > each
> > > device. In embedded world memory is a precious resource. Of course we can
> > go
> > > the 'separate memory region for each device' way, but we observed that at
> > > least some memory can be recovered if we decide to share memory regions
> > for
> > > some of the devices.
> > >
> > > Assigning regions to the drivers is a way to describe how memory can be
> > > shared. This is something that is independent from the actual drivers.
> > > Device drivers cannot and mustn't have such knowledge.
> > 
> > You misunderstand me. I wasn't promoting separate memory regions for each
> > device.  Quite the opposite: initially I would make regions shared by default.
> > I actually do not see why device drivers can't have the knowledge to allocate
> > their memory from particular regions. We are talking about embedded systems 
> > where these types of memory allocations tend to be well defined. E.g. the
> > video capture driver needs to allocate X video buffers at boot time. For
> > most devices that will be the dma region, in the case of samsung devices
> > that will be regions for particular memory banks.
> 
> Let me introduce one more example. As you may know we have 3 video-processor
> capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
> can capture video data from camera sensor and accelerate common video
> processing tasks (like up/down scaling and color space conversion). Two FIMC
> and MFC are require for things like HD video encoding or decoding with
> online display/preview. This task require huge video buffers that are
> usually allocated and freed at the same time. The third FIMC can be used for
> gfx acceleration (color space conversion and scaling are quite common tasks
> in GUI). This latter task usually introduces a lot of memory fragmentation,
> as gfx surfaces are usually quite small (definitely smaller than HD frames
> or 8MPix picture from camera). It would be really wise to get that third
> FIMC device to use memory buffer that will be shared with 3D accelerator
> (which has quite similar usage scenarios and suffers from similar memory
> fragmentation).

OK, I understand. And I assume both gfx and 3D acceleration need to use a
specific region? If they can use any type of memory, then this might be more
appropriate for kmalloc and friends.

> > The only time reserving regions for specific devices might be necessary is
> > if you can get memory fragmentation if multiple drivers use the same region.
> > But if drivers just allocate the memory at boot, then shared regions work
> > fine.
> > >
> > > > And it can know that provided regions are setup by the platform code
> > and
> > > > not created dynamically. This will simplify things enormously.
> > > >
> > > > > +    1. The core of the CMA does not handle allocation of memory and
> > > > > +       management of free space.  Dedicated allocators are used for
> > > > > +       that purpose.
> > > > > +
> > > > > +       This way, if the provided solution does not match demands
> > > > > +       imposed on a given system, one can develop a new algorithm
> > and
> > > > > +       easily plug it into the CMA framework.
> > > > > +
> > > > > +       The presented solution includes an implementation of a best-
> > fit
> > > > > +       algorithm.
> > > >
> > > > Again, do we really need user-settable per-region allocators? Just
> > provide
> > > > one with the option to later choose others through the kernel Kconfig
> > files.
> > >
> > > From our experience, yes. Different allocators can cope with different
> > memory
> > > usage scenarios better or worse. This results in higher or lower memory
> > > fragmentation. System use cases are something that kernel or drivers are
> > > definitely not aware, so only user space can tune this parameter to get
> > the
> > > best possible system behavior.
> > 
> > Can you describe some of those usage scenarios? The cases I am familiar
> > with
> > are the standard: 'need to allocate X buffers of Y MB memory each on boot'
> > and these buffers are only freed when the module is unloaded. While you can
> > get fragmentation in that case if you creatively load and unload modules,
> > this is not the normal usage in a deployed system.
> 
> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
> want to just reserve XX MB memory and then dynamically allocate buffers from
> it. This enables us to perform the following 2 tasks:
> 1. movie decoding in HD-quality (only one instance)
> 2. two instances of SD-quality movie decoding and SD-quality move encoding 
>    (example: video conference)
> 
> We know that these two use cases are exclusive, so they can use the same
> reserved memory.

When I said 'allocating X buffers of Y MB memory' I meant that you need to
allocate a known amount of memory (X * Y MB in this case). So the boot args
say e.g. dma=40MB and the driver just allocates X buffers from that region.

Switching to SD quality requires releasing those buffers and instead allocating
a number of smaller buffers from the same region.

For these use-cases the allocator can be very simple and probably covers most
use-cases.

Anyway, I'm no expert on memory allocators and people on the linux-mm list are
no doubt much more qualified to discuss this. My main concern is that of
trying to add too much for a first release. It is simply easier to start simple
and extend as needed. That makes it easier to be accepted in the mainline.

> 
> > We already have several 'normal' memory allocators (slab, slub, slob, sleb,
> > whatever. I've lost track there.) and I'm not very enthusiastic about the
> > idea of adding another bunch of them. Especially not in a first release.
> >
> > Keep it simple. Just fix the core problem first, even if you are not yet
> > able to squeeze the last byte of memory from your system. Later you can 
> > always add features as needed and optimize it further.
> 
> We are trying to keep it as simple as possible :)
>  
> > > > We can always add more complex scenarios later, but for an initial
> > > > version I'd keep it simple.
> > > >
> > > > > +
> > > > > +    2. CMA allows a run-time configuration of the memory regions it
> > > > > +       will use to allocate chunks of memory from.  The set of memory
> > > > > +       regions is given on command line so it can be easily changed
> > > > > +       without the need for recompiling the kernel.
> > > > > +
> > > > > +       Each region has its own size, alignment demand, a start
> > > > > +       address (physical address where it should be placed) and an
> > > > > +       allocator algorithm assigned to the region.
> > > > > +
> > > > > +       This means that there can be different algorithms running at
> > > > > +       the same time, if different devices on the platform have
> > > > > +       distinct memory usage characteristics and different algorithms
> > > > > +       match those the best way.
> > > >
> > > > Seems like overengineering to me. Just ensure that the code can be extended
> > > > later to such hypothetical scenarios. They are hypothetical, right?
> > >
> > > Not really. Having the possibility to reconfigure memory configuration
> > > without kernel recompilation is very handy when one is tuning the
> > > configuration for the specific use case.
> > >
> > > > > +    3. When requesting memory, devices have to introduce themselves.
> > > > > +       This way CMA knows who the memory is allocated for.  This
> > > > > +       allows for the system architect to specify which memory regions
> > > > > +       each device should use.
> > > > > +
> > > > > +       3a. Devices can also specify a "kind" of memory they want.
> > > > > +           This makes it possible to configure the system in such
> > > > > +           a way, that a single device may get memory from different
> > > > > +           memory regions, depending on the "kind" of memory it
> > > > > +           requested.  For example, a video codec driver might want to
> > > > > +           allocate some shared buffers from the first memory bank and
> > > > > +           the other from the second to get the highest possible
> > > > > +           memory throughput.
> > > >
> > > > Not sure I understand this. Isn't this just two regions, one for each
> > > > memory bank, and the driver requests some buffers from one region and
> > > > some from the other?
> > >
> > > Right.
> > >
> > > > Not sure how a 'kind of memory' features in this.
> > >
> > > This 'kind' is just a cookie or a label used by the driver to distinguish
> > > requests for both memory banks. This functionality is essential for our
> > > hardware (just for the hardware video codec we have 3 'kinds' of memory:
> > > memory bank A, memory bank B and a special region for the firmware).
> > 
> > Let me see if I understand: driver D needs memory from bank A, so it
> > specifies e.g. label "a" when requesting memory. And the kernel parameters
> > are then supposed to define a region for memory bank A (say region
> > "bank-a") and tell the cma that requests from driver D for memory of kind
> > "a" should go to region "bank-a".
> > 
> > Of course, the driver can also just request memory from the
> > platform-defined region "bank-a" directly rather than having to rely on
> > userspace provided parameters.
> 
> Right. Regions provided by kernel command line are considered an alternative
> to the default configuration for tuning/debugging purposes.
> 
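
The (driver, kind) to region routing discussed above can be pictured as a
small lookup table. The following is only a sketch under assumed names:
struct map_entry, example_map and lookup_region() are invented for
illustration and do not come from the patchset, which uses its own
configuration strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping table; the names are invented for illustration. */
struct map_entry {
	const char *dev;	/* driver name, NULL matches any driver */
	const char *kind;	/* label passed by the driver, NULL matches any */
	const char *region;	/* region the request is routed to */
};

static const struct map_entry example_map[] = {
	{ "codec", "a",  "bank-a" },
	{ "codec", "b",  "bank-b" },
	{ "codec", "fw", "fw" },
	{ NULL,    NULL, "bank-a" },	/* default for everything else */
};

#define EXAMPLE_MAP_LEN (sizeof(example_map) / sizeof(example_map[0]))

/* Return the region of the first entry matching (dev, kind), or NULL. */
const char *lookup_region(const struct map_entry *map, size_t n,
			  const char *dev, const char *kind)
{
	assert(map && dev && kind);

	for (size_t i = 0; i < n; i++) {
		if (map[i].dev && strcmp(map[i].dev, dev) != 0)
			continue;
		if (map[i].kind && strcmp(map[i].kind, kind) != 0)
			continue;
		return map[i].region;
	}
	return NULL;
}
```

The driver only ever passes its own name and a label such as "a"; which
region that label resolves to is decided by whoever fills in the table
(platform code by default, the command line when tuning).
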
> > > > > +    4. For greater flexibility and extensibility, the framework allows
> > > > > +       device drivers to register private regions of reserved memory
> > > > > +       which then may be used only by them.
> > > > > +
> > > > > +       As an effect, if a driver would not use the rest of the CMA
> > > > > +       interface, it can still use CMA allocators and other
> > > > > +       mechanisms.
> > > >
> > > > Why would you? Is there an actual driver that will need this?
> > >
> > > This feature has been added after posting v1 of this rfc/patch. Jonathan
> > > Corbet suggested in <http://article.gmane.org/gmane.linux.kernel.mm/50689>
> > > that viafb driver might register its own private memory and use cma just
> > > as an allocator. IMHO this is a good idea, this way we might remove a
> > > bunch of custom allocators from the drivers (yes, there are such all over
> > > the kernel).
> > 
> > It goes back to whether we want per-device (or per-driver) regions or just
> > have global regions.
> 
> IMHO both types of regions can coexist and we can make use of their
> advantages. I see no point in limiting CMA only to one type of the regions.
> 
> > What I have seen in practice is that these drivers just
> > need X amount of contiguous memory on boot. Having just a single region (as
> > it will be for most systems) to carve the buffers from is just as efficient
> > if not more than creating separate regions for each driver. Only if you
> > start freeing and reallocating memory later on will you get into trouble.
> 
> > But if you do that, then you are trying to duplicate the behavior of the
> > normal allocators in my opinion. I really don't think we want to go there.
> 
> Well, then why do some drivers like viafb or omapfb have custom memory
> allocators? This case shows that there is still a need for another one.

I have no idea why they have allocators. Someone would have to analyze that
code. Just because they use allocators doesn't necessarily mean that they
really need them :-)

> 
> > > > > +       4a. Early in boot process, device drivers can also request the
> > > > > +           CMA framework to reserve a region of memory for them
> > > > > +           which then will be used as a private region.
> > > > > +
> > > > > +           This way, drivers do not need to directly call bootmem,
> > > > > +           memblock or similar early allocator but merely register an
> > > > > +           early region and the framework will handle the rest
> > > > > +           including choosing the right early allocator.
> > > >
> > > > The whole concept of private regions seems unnecessary to me.
> > > >
> > > > <big snip>
> > > >
> > > > It looks to me as if you tried to think of all possible hypothetical
> > > > situations and write a framework for that. Of course, you may know more
> > > > than I do, and some of these situations actually happen.
> > >
> > > Not exactly. We tried to design a solution that would cover all
> > > requirements for OUR (quite specific) embedded hardware. However we didn't
> > > want to tie it only to our platform. We just generalized most of our
> > > requirements so they can be reused for other systems.
> > >
> > > > The basic design ideas are solid, I think. But you should get rid of
> > > > all the fancy features and go back to basics. We can always add those
> > > > features later should that become necessary. But removing features is
> > > > much, much harder.
> > >
> > > Well, please keep in mind that we cannot remove features that are
> > > essential for our solution. We know that a simple framework has some
> > > advantages (well, the most important one is the fact that it is easy to
> > > understand), but making it too simple would render it useless from our
> > > point of view (if it would not provide functionality required by our
> > > drivers and hardware).
> > 
> > Why not give a list of the requirements that your hardware has? The only
> > requirement unique to your hardware that I am aware of is the need for
> > buffers in specific memory banks (and apparently a special memory area for
> > firmware).
> 
> Yes, this is a real hardware requirement that cannot be dropped. Other
> requirements come from the system design (software).
> 
> > So besides a 'dma' region (which is probably sufficient for most systems)
> > you would also have a 'banka' and 'bankb' region and perhaps a 'fw' region.
> 
> Actually this DMA region is either just a 'banka' or 'bankb'. There are no
> special requirements for the common dma memory and no need for the separate
> 'dma' region in our case. We just want to make it shared with at least camera
> capture buffers.
> 
> > In the kernel parameters the user can specify the sizes for these regions
> > that are reserved at boot time and drivers will use cma to get their memory
> > from the relevant region using a simple allocator.
> > 
> > If there are other requirements in your hardware (or other hardware) that
> > should be taken into account for an initial version of the cma, then I'm
> > interested in hearing about it.
> 
> We will try to summarize our requirements again and repost them soon.
>  
> > One note: since the first version of the patch wasn't posted to linux-media
> > I didn't follow the replies to that. If there are things in that thread
> > that I should read, then just mail me some links.
> 
> I'm very sorry that the first version was posted only to linux-mm.
> The discussion was mainly about how the memory configuration and
> device mapping should be provided (first version used only kernel parameters
> for that). We should probably also CC a few more kernel subsystems that might
> benefit from a common memory allocator, but we already were a bit scared by
> the number of people and lists in this series. We really don't want to get
> a 'spammer' label ;)

You probably got the main interested parties now.

Regarding regions and shared and per-driver buffers: I've been thinking about this
a bit more and I have a proposal of my own.

There are two different aspects to this. First there is the hardware aspect: if
the hardware needs memory from specific memory banks or with specific requirements
(e.g. DMAable), then those regions should be set up in the platform code. There you
know the memory sizes/alignments/etc. since that is hw dependent. The other reason
is that drivers need to be able to tell CMA that they need to allocate from such
regions. You can't have a driver refer to a region that is specified through
kernel parameters; that would create a very ugly dependency.

The other aspect is how to set up buffers. A global buffer is simply set up by
assigning a size to the region: "banka=20MB". Unless specified otherwise any
driver that needs memory from banka will use that global banka buffer.

Alternatively, you can set aside memory from a region specifically for drivers:
banka/foo=30MB. This allocates 30 MB from region banka specifically for driver foo.

You can also share that with another driver:

banka/foo,bar=30MB

Now this 30 MB buffer is shared between drivers foo and bar.
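
To make the proposed syntax concrete, here is a sketch of a parser for
strings of the form "region[/driver[,driver...]]=<N>MB". It only illustrates
the notation used in this mail: struct buf_spec, parse_buf_spec() and the
size limits are hypothetical, and no such parser exists in the patchset.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DRIVERS 4
#define NAME_LEN 16

struct buf_spec {
	char region[NAME_LEN];
	char drivers[MAX_DRIVERS][NAME_LEN];
	int ndrivers;		/* 0 means a global buffer for the region */
	unsigned long size_mb;
};

/* Copy a non-empty name of len bytes; fail if it does not fit. */
static int copy_name(char *dst, const char *src, size_t len)
{
	if (len == 0 || len >= NAME_LEN)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}

/* Parse "region[/drv[,drv...]]=<N>MB"; return 0 on success, -1 on error. */
int parse_buf_spec(const char *s, struct buf_spec *out)
{
	const char *eq, *slash, *p, *comma, *end;
	char *tail;

	assert(s && out);
	memset(out, 0, sizeof(*out));

	eq = strchr(s, '=');
	if (!eq)
		return -1;
	slash = memchr(s, '/', (size_t)(eq - s));
	end = slash ? slash : eq;
	if (copy_name(out->region, s, (size_t)(end - s)))
		return -1;

	/* Walk the comma-separated driver list between '/' and '='. */
	for (p = slash ? slash + 1 : NULL; p; p = comma ? comma + 1 : NULL) {
		comma = memchr(p, ',', (size_t)(eq - p));
		end = comma ? comma : eq;
		if (out->ndrivers == MAX_DRIVERS ||
		    copy_name(out->drivers[out->ndrivers], p, (size_t)(end - p)))
			return -1;
		out->ndrivers++;
	}

	/* The size must be a decimal number followed by exactly "MB". */
	if (!isdigit((unsigned char)eq[1]))
		return -1;
	out->size_mb = strtoul(eq + 1, &tail, 10);
	return strcmp(tail, "MB") == 0 ? 0 : -1;
}
```

Parsing "banka=20MB" yields a global 20 MB buffer (ndrivers is 0), while
"banka/foo,bar=30MB" yields a 30 MB buffer shared by foo and bar. In both
cases the driver itself still only names the region banka.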

The nice thing about this is that the driver will still only refer to region
banka as set up by the platform code.

And in the more general case you can have two standard regions: dma and common.
So drivers can rely on the presence of a dma region when allocating buffers.

What would make this even better is if CMA had the option to try and allocate
additional memory on the fly when its memory pool is exhausted. E.g. if the dma
pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).

This allows you to set up the dma and common regions with size 0. So allocating
from the dma region would effectively be the same as doing a kmalloc, unless
the user sets up a dma area in the kernel parameters.
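
The fallback idea can be sketched in userspace as follows. This is an
assumption-laden illustration, not a proposed interface: struct pool,
pool_alloc() and pool_free() are hypothetical, malloc() stands in for
kmalloc(..., GFP_KERNEL | __GFP_DMA), and the pool itself is a plain bump
pointer that never reuses released space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool descriptor; not an interface from the CMA patches. */
struct pool {
	unsigned char *base;	/* reserved memory, may be NULL when size is 0 */
	size_t size;		/* bytes reserved for this region */
	size_t used;		/* bytes handed out so far (simple bump pointer) */
	int allow_fallback;	/* 0 for bank-specific regions */
};

struct buffer {
	void *ptr;		/* NULL if the allocation failed */
	int from_pool;		/* 1 if carved from the reserved region */
};

/* Try the reserved pool first, then the general allocator if permitted. */
struct buffer pool_alloc(struct pool *p, size_t size)
{
	struct buffer b = { NULL, 0 };

	assert(p && p->used <= p->size);
	if (size == 0)
		return b;

	if (size <= p->size - p->used) {
		b.ptr = p->base + p->used;
		b.from_pool = 1;
		p->used += size;
	} else if (p->allow_fallback) {
		/* Userspace stand-in for kmalloc(size, GFP_KERNEL | __GFP_DMA). */
		b.ptr = malloc(size);
	}
	return b;
}

/* Only fallback buffers are returned to the general allocator here. */
void pool_free(struct buffer b)
{
	if (!b.from_pool)
		free(b.ptr);
}
```

A pool with size 0 and allow_fallback set sends every request to the general
allocator, which matches the "region of size 0" case above. A bank-specific
pool would leave allow_fallback clear, so an exhausted pool simply fails.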

This is probably impossible if you need memory from specific memory banks,
so this is something that is not available for every region.

The nice thing about this is that it is very flexible for end users. For example,
most users of the ivtv driver wouldn't have to do anything since most of the time
it is able to assign the necessary buffers. But some users have multiple ivtv-based
capture boards in their PC, and then it can become harder to have ivtv obtain the
needed buffers. In that case they can preallocate the buffers by setting
dma/ivtv=500MB or something like that.

That would be a really nice feature...

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-08-01 13:26               ` Hans Verkuil
  (?)
@ 2010-08-02 15:51                 ` Michał Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michał Nazarewicz @ 2010-08-02 15:51 UTC (permalink / raw)
  To: Marek Szyprowski, Hans Verkuil
  Cc: 'Daniel Walker', 'Jonathan Corbet',
	Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', linux-mm, 'Kyungmin Park',
	'Zach Pfeffer',
	linux-arm-kernel, linux-media

> On Wednesday 28 July 2010 11:04:56 Marek Szyprowski wrote:
>> Let me introduce one more example. As you may know we have 3 video-processor
>> capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
>> can capture video data from camera sensor and accelerate common video
>> processing tasks (like up/down scaling and color space conversion). Two FIMC
>> and MFC are require for things like HD video encoding or decoding with
>> online display/preview. This task require huge video buffers that are
>> usually allocated and freed at the same time. The third FIMC can be used for
>> gfx acceleration (color space conversion and scaling are quite common tasks
>> in GUI). This latter task usually introduces a lot of memory fragmentation,
>> as gfx surfaces are usually quite small (definitely smaller than HD frames
>> or 8MPix picture from camera). It would be really wise to get that third
>> FIMC device to use memory buffer that will be shared with 3D accelerator
>> (which has quite similar usage scenarios and suffers from similar memory
>> fragmentation).

On Sun, 01 Aug 2010 15:26:13 +0200, Hans Verkuil <hverkuil@xs4all.nl> wrote:
> OK, I understand. And I assume both gfx and 3D acceleration need to use a
> specific region? If they can use any type of memory, then this might be more
> appropriate for kmalloc and friends.

I've been thinking about providing a "fake" region with a "fake" allocator which
would allow in a generic way passing requests to kmalloc() and friends.  Such
regions could prove valuable for small allocations in things like 3D accelerator.

But as you've said, it's better to provide something small first and later add to
it so I'm postponing implementation of this feature.

Note, however, that a 3D accelerator does not operate only on small chunks of memory.
A 1024x1024 texture is 1 Mipx.  RGB makes it 3 MiB.  With mipmaps it's 4 MiB.  Even a
512x512 texture can reach 1 MiB this way.  It may be impossible to allocate such
chunks with just a kmalloc().
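
To make the arithmetic above concrete, here is a small userspace sketch (not
part of the patch).  It assumes packed 3-byte RGB and a full mipmap chain down
to 1x1; texture_bytes() is a name made up for this illustration.

```c
#include <stddef.h>

/*
 * Bytes needed for a square texture with the given side length.
 * With mipmaps, every level halves the side until it reaches 1x1.
 * bytes_per_pixel = 3 models packed RGB (an assumption; hardware
 * may well pad pixels to 4 bytes).
 */
size_t texture_bytes(size_t side, size_t bytes_per_pixel, int with_mipmaps)
{
	size_t pixels = 0;

	do {
		pixels += side * side;
		side /= 2;
	} while (with_mipmaps && side > 0);

	return pixels * bytes_per_pixel;
}
```

For a 1024x1024 texture this gives exactly 3 MiB without mipmaps and one byte
short of 4 MiB with them, since the chain adds roughly one third on top.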

>> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
>> want to just reserve XX MB memory and then dynamically allocate buffers from
>> it. This enables us to perform the following 2 tasks:
>> 1. movie decoding in HD-quality (only one instance)
>> 2. two instances of SD-quality movie decoding and SD-quality move encoding
>>    (example: video conference)
>>
>> We know that these two use cases are exclusive, so they can use the same
>> reserved memory.

> When I said 'allocating X buffers of Y MB memory' I meant that you need to
> allocate a known amount of memory (X * Y MB in this case). So the boot args
> say e.g. dma=40MB and the driver just allocates X buffers from that region.

But the point is that the driver does not allocate memory at boot time.  If the video
codec allocated memory at boot time, no one else could use it even when the
codec is not used.  The idea is to let other devices use the memory when the
codec is idle.  For instance, one could work on huge JPEG images and need
buffers for a hardware JPEG codec.

Or have I misunderstood your point?

> Switching to SD quality requires releasing those buffers and instead allocating
> a number of smaller buffers from the same region.

Our intention is that the driver would allocate buffers only when needed, and the buffers
would be freed when the video codec driver is released.  So when the device is opened
(whatever that would mean for a particular device) it would allocate enough memory
for the requested task.

> For these use-cases the allocator can be very simple and probably covers most
> use-cases.

Yes, this is our experience.  The best-fit algorithm, even though simple, seems to
handle the use cases tested on our system with little fragmentation.
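
For readers unfamiliar with the term, the best-fit rule can be sketched in a
few lines of userspace C.  This is only an illustration of the general
technique, not the allocator from the patch: struct free_range, best_fit() and
region_alloc() are hypothetical names, and freeing/coalescing is left out.

```c
#include <stddef.h>

struct free_range {
	size_t start;	/* offset of the hole inside the region */
	size_t size;	/* length of the hole in bytes, 0 = used up */
};

/*
 * Return the index of the smallest hole that can still hold `size`
 * bytes, or -1 if no hole is big enough.
 */
int best_fit(const struct free_range *holes, int n, size_t size)
{
	int best = -1;
	int i;

	for (i = 0; i < n; i++) {
		if (holes[i].size < size)
			continue;
		if (best < 0 || holes[i].size < holes[best].size)
			best = i;
	}
	return best;
}

/*
 * Carve `size` bytes from the front of the best-fitting hole.
 * Returns the start offset of the chunk, or (size_t)-1 on failure.
 */
size_t region_alloc(struct free_range *holes, int n, size_t size)
{
	size_t start;
	int i;

	if (size == 0)
		return (size_t)-1;

	i = best_fit(holes, n, size);
	if (i < 0)
		return (size_t)-1;

	start = holes[i].start;
	holes[i].start += size;
	holes[i].size -= size;
	return start;
}
```

Picking the tightest hole keeps the large holes intact for the large buffers,
which is presumably why it copes well with a mix of big and small requests.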

> Anyway, I'm no expert on memory allocators and people on the linux-mm list are
> no doubt much more qualified to discuss this. My main concern is that of
> trying to add too much for a first release. It is simply easier to start simple
> and extend as needed. That makes it easier to be accepted in the mainline.

I'm trying to keep it as simple as possible :) while still making it useful for us.

In particular we need a way to specify where different regions reside (different
memory banks, etc.) as well as specify which drivers should use which regions.
What's more, we need the notion of a "kind" of memory as one driver may need
memory buffers from different regions (i.e. MFC needs to allocate buffers from
both banks).

>>>>>> +    2. CMA allows a run-time configuration of the memory regions it
>>>>>> +       will use to allocate chunks of memory from.  The set of memory
>>>>>> +       regions is given on command line so it can be easily changed
>>>>>> +       without the need for recompiling the kernel.
>>>>>> +
>>>>>> +       Each region has it's own size, alignment demand, a start
>>>>>> +       address (physical address where it should be placed) and an
>>>>>> +       allocator algorithm assigned to the region.
>>>>>> +
>>>>>> +       This means that there can be different algorithms running at
>>>>>> +       the same time, if different devices on the platform have
>>>>>> +       distinct memory usage characteristics and different algorithm
>>>>>> +       match those the best way.

>>>>> Seems overengineering to me. Just ensure that the code can be extended
>>>>> later to such hypothetical scenarios. They are hypothetical, right?

1. Everyone seems to hate the command line interface that was present in the
    first and second versions of the patch.  As such, I've made it optional
    (via a Kconfig option) in the third version (not posted yet), which
    unfortunately makes platform initialisation code longer and more
    complicated but hopefully more people will be happy. ;)

2. We need to specify size, alignment and start address so those are not
    hypothetical.

3. The algorithms are somewhat hypothetical (we haven't tried using a different
    allocator as of yet) but I think it's much easier to design the whole system
    with them in mind and implement them in the first version than to add code
    for them later.

>>>>>> +    4. For greater flexibility and extensibility, the framework allows
>>>>>> +       device drivers to register private regions of reserved memory
>>>>>> +       which then may be used only by them.
>>>>>> +
>>>>>> +       As an effect, if a driver would not use the rest of the CMA
>>>>>> +       interface, it can still use CMA allocators and other
>>>>>> +       mechanisms.

>>>>> Why would you? Is there an actual driver that will need this?

>>>> This feature has been added after posting v1 of this rfc/patch. Jonathan
>>>> Corbet suggested in
>>>> <http://article.gmane.org/gmane.linux.kernel.mm/50689>
>>>> that viafb driver might register its own private memory and use cma just
>>>> as an allocator.

I may also add that adding this actually made me refactor the code a bit,
making it more readable in the end, I think. :)

>>> What I have seen in practice is that these drivers just
>>> need X amount of contiguous memory on boot. Having just a single region (as
>>> it will be for most systems) to carve the buffers from is just as efficient
>>> if not more than creating separate regions for each driver. Only if you
>>> start freeing and reallocating memory later on will you get into trouble.
>>>
>>> But if you do that, then you are trying to duplicate the behavior of the
>>> normal allocators in my opinion. I really don't think we want to go there.

Please note that kmalloc() was not designed to handle big chunks of memory
and vmalloc() does not give contiguous memory blocks.  This is usually
reason enough for a custom allocator that operates on a big region of memory
reserved at boot time.

For instance, if some driver operates on buffers that are between 512 KiB and 4 MiB
(as I've shown above such sizes could well be required for textures) it needs to
reserve some big region of contiguous memory and then manage it by itself.

One of CMA's goals is to give a common API for drivers that need such allocators.

>>>>>> +       4a. Early in boot process, device drivers can also request the
>>>>>> +           CMA framework to a reserve a region of memory for them
>>>>>> +           which then will be used as a private region.
>>>>>> +
>>>>>> +           This way, drivers do not need to directly call bootmem,
>>>>>> +           memblock or similar early allocator but merely register an
>>>>>> +           early region and the framework will handle the rest
>>>>>> +           including choosing the right early allocator.

>>>>> The whole concept of private regions seems unnecessary to me.

This particular thing was suggested by someone, I think.  Or maybe someone wrote
something that made me think about it?  Someone suggested that drivers may want
to just grab some region of memory and have it for themselves.  I'd rather see
them using the other set of CMA APIs, but nonetheless it may prove
useful for someone.

This is especially true for devices with their own memory which only their
driver should have access to.  I admit that it is a bit hypothetical though.

At any rate, with the changes made between the first and the second (this one)
versions of the patch, private regions were actually trivial to add.  This
merely mimics the way regions are reserved at boot time so the code is
simply identical to what platform initialisation code may use.  The only
thing that makes private regions special is the fact that they have no name.

>>>>> It looks to me as if you tried to think of all possible hypothetical
>>>>> situations and write a framework for that.

Not exactly...  The first version of the patch provided fewer features and
this was mostly what we needed on our platform with maybe a few features
that weren't a must.

After posting we received some comments and suggestions which made me change
the code a bit, making it more flexible and dynamic and at the same time letting
more features in.

> Regarding regions and shared and per-driver buffers: I've been thinking about
> this a bit more and I have a proposal of my own.
>
> There are two different aspects to this: first there is the hardware aspect: if
> the hardware needs memory from specific memory banks or with specific requirements
> (e.g. DMAable), then those regions should be setup in the platform code. There you
> know the memory sizes/alignments/etc. since that is hw dependent. The other reason
> is that drivers need to be able to tell CMA that they need to allocate from such
> regions.  You can't have a driver refer to a region that is specified through
> kernel parameters, that would create a very ugly dependency.
>
> The other aspect is how to setup buffers. A global buffer is simply setup by
> assigning a size to the region: "banka=20MB". Unless specified otherwise any
> driver that needs memory from banka will use that global banka buffer.
>
> Alternatively, you can set aside memory from a region specifically for drivers:
> banka/foo=30MB. This allocated 30 MB from region banka specifically for driver foo.
>
> You can also share that with another driver:
>
> banka/foo,bar=30MB
>
> Now this 30 MB buffer is shared between drivers foo and bar.

Let me rephrase it to see if I got it right:

You propose that the platform will define what types of memory it has.  For instance
banka for the first bank, bankb for the second memory bank, dma for DMA-able
memory, etc.  Those definitions would be merely informative and by themselves
they would not reserve any memory.

Later, it would be possible to specify regions of memory of those types.  For
instance:

   banka=20M; banka/foo,bar=30M

would register two regions in the memory type "banka" such that the first is 20 MiB
and used by all drivers except for drivers foo and bar, which would use the second
region of 30 MiB?
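
If I read the proposal that way, the lookup could be sketched as below.  This
is a userspace illustration only: struct region, list_has() and pick_region()
are hypothetical names, and the rule "a region naming the driver wins over the
global one" is my reading of the proposal rather than anything stated in it.

```c
#include <stddef.h>
#include <string.h>

struct region {
	const char *type;	/* memory type from the platform, e.g. "banka" */
	const char *drivers;	/* comma-separated driver names, NULL = global */
	size_t size_mib;
};

/* Does the comma-separated list contain exactly this name? */
int list_has(const char *list, const char *name)
{
	size_t len = strlen(name);

	while (list && *list) {
		const char *end = strchr(list, ',');
		size_t n = end ? (size_t)(end - list) : strlen(list);

		if (n == len && strncmp(list, name, n) == 0)
			return 1;
		list = end ? end + 1 : NULL;
	}
	return 0;
}

/*
 * Pick the region of the given type for a driver: a region that names
 * the driver wins, otherwise the global region of that type is used.
 * Returns NULL if the platform has no such memory type.
 */
const struct region *pick_region(const struct region *r, int n,
				 const char *type, const char *driver)
{
	const struct region *global = NULL;
	int i;

	for (i = 0; i < n; i++) {
		if (strcmp(r[i].type, type) != 0)
			continue;
		if (!r[i].drivers)
			global = &r[i];
		else if (list_has(r[i].drivers, driver))
			return &r[i];
	}
	return global;
}
```

With the two regions from the example, foo and bar would land in the 30 MiB
region and every other driver asking for "banka" in the 20 MiB one.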

> The nice thing about this is that the driver will still only refer to region
> banka as setup by the platform code.

So the driver would request a memory type "banka" and then get a chunk from one of
the abovementioned regions?

I somehow like the simplicity of that but I see some disadvantages:

1. Imagine a video decoder which for best performance should operate on some buffers
    from the first and some buffers from the second bank.  However, if the buffers are
    from the incorrect bank it will still work, only slower.  In such situations you
    cannot specify that when driver foo requests memory type "banka" then it should
    first try memory type "banka" but if allocation failed there try "bankb".

2. What if the device handled by the above driver were run on a platform with only
    one memory bank?  The driver would still refer to "banka" and "bankb" but there
    would be no such types in the system.

3. What if there were one driver, initially written for platform X which used names
    "banka" and "bankb", and another driver, initially written for platform Y which
    used names "bank1" and "bank2".  How would you make them work on a single platform
    with two memory banks?

4. This is hypothetical, but the "kind" defined by CMA could be used to specify
    characteristics that are not hardware dependent.  For instance some driver
    could use kind "bulk" for some big, several MiB buffers and "control" for
    small (under a MiB) buffers.  Regions for those kinds could be of the same
    type of memory but it could be valuable to split those into two regions to
    minimise fragmentation.

> And in the more general case you can have two standard regions: dma and common.
> So drivers can rely on the presence of a dma region when allocating buffers.

I think that a driver should not care about or know region names at all.

> What would make this even better is that CMA has the option to try and allocate
> additional memory on the fly if its memory pool becomes empty. E.g. if the dma
> pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).

As I've said somewhere above, I was thinking about something like it.

> This allows you to setup the dma and common regions with size 0. So allocating
> from the dma region would effectively be the same as doing a kmalloc. Unless
> the user sets up a dma area in the kernel parameters.
>
> Obviously this is probably impossible if you need memory from specific memory
> banks, so this is something that is not available for every region.
>
> The nice thing about this is that it is very flexible for end users. For example,
> most users of the ivtv driver wouldn't have to do anything since most of the time
> it is able to assign the necessary buffers. But some users have multiple ivtv-based
> capture boards in their PC, and then it can become harder to have ivtv obtain the
> needed buffers. In that case they can preallocate the buffers by setting
> dma/ivtv=500MB or something like that.
>
> That would be a really nice feature...

I think the main difference between your proposal and what is in CMA is that you
propose that the platform define types of memory and later on the user will be able to
define regions of a given type of memory.  This means that drivers would have to
be aware of the names of the types and specify the type name with each allocation.

The CMA solution, however, lets drivers define their own kinds of memory and later
on platform initialisation code maps drivers with their kinds to regions.
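
A rough userspace model of that mapping, including the "try the preferred bank
first, then fall back" case from point 1 above, might look like this.  None of
it is the actual CMA API: the structures, alloc_by_kind() and the kind names
used in the usage note are assumptions made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_FALLBACK 4

struct region_sketch {
	const char *name;	/* chosen by the platform, unknown to drivers */
	size_t free_bytes;
};

struct map_entry {
	const char *driver;
	const char *kind;		/* chosen by the driver */
	int regions[MAX_FALLBACK];	/* region indices in order, -1 ends the list */
};

/*
 * Try each region mapped to (driver, kind) in order and take the memory
 * from the first one with enough room.  Returns the index of the region
 * used, or -1 if there is no mapping or every candidate is too full.
 */
int alloc_by_kind(struct region_sketch *regions,
		  const struct map_entry *map, int n_map,
		  const char *driver, const char *kind, size_t size)
{
	int i, j;

	for (i = 0; i < n_map; i++) {
		if (strcmp(map[i].driver, driver) || strcmp(map[i].kind, kind))
			continue;
		for (j = 0; j < MAX_FALLBACK && map[i].regions[j] >= 0; j++) {
			struct region_sketch *r = &regions[map[i].regions[j]];

			if (r->free_bytes >= size) {
				r->free_bytes -= size;
				return map[i].regions[j];
			}
		}
		return -1;
	}
	return -1;
}
```

A platform with two banks could map a driver's kind "a" to { bank 0, bank 1 }
and kind "b" to { bank 1, bank 0 }; a platform with one bank would map both
kinds to the same region, and the driver code would not change.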

Have I got it right?

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-08-02 15:51                 ` Michał Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michał Nazarewicz @ 2010-08-02 15:51 UTC (permalink / raw)
  To: Marek Szyprowski, Hans Verkuil
  Cc: 'Daniel Walker', 'Jonathan Corbet',
	Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', linux-mm, 'Kyungmin Park',
	'Zach Pfeffer',
	linux-arm-kernel, linux-media

> On Wednesday 28 July 2010 11:04:56 Marek Szyprowski wrote:
>> Let me introduce one more example. As you may know we have 3 video-processor
>> capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
>> can capture video data from camera sensor and accelerate common video
>> processing tasks (like up/down scaling and color space conversion). Two FIMC
>> and MFC are require for things like HD video encoding or decoding with
>> online display/preview. This task require huge video buffers that are
>> usually allocated and freed at the same time. The third FIMC can be used for
>> gfx acceleration (color space conversion and scaling are quite common tasks
>> in GUI). This latter task usually introduces a lot of memory fragmentation,
>> as gfx surfaces are usually quite small (definitely smaller than HD frames
>> or 8MPix picture from camera). It would be really wise to get that third
>> FIMC device to use memory buffer that will be shared with 3D accelerator
>> (which has quite similar usage scenarios and suffers from similar memory
>> fragmentation).

On Sun, 01 Aug 2010 15:26:13 +0200, Hans Verkuil <hverkuil@xs4all.nl> wrote:
> OK, I understand. And I assume both gfx and 3D acceleration need to use a
> specific region? If they can use any type of memory, then this might be more
> appropriate for kmalloc and friends.

I've been thinking about providing a "fake" region with a "fake" allocator which
would allow in a generic way passing requests to kmalloc() and friends.  Such
regions could prove valuable for small allocations in things like 3D accelerator.

But as you've said, it's better to provide something small first and later add to
it so I'm postponing implementation of this feature.

Note, however, that 3D accelerator does not operate only on small chunks of memory.
A 1024x1024 texture is 1 Mipx.  RGB makes it 3MiB.  With mipmap it's 4MiB.  Even
512x512 texture can reach 1MiB this way.  It ma be impossible to allocate such
chunks with just a kmalloc().

>> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
>> want to just reserve XX MB memory and then dynamically allocate buffers from
>> it. This enables us to perform the following 2 tasks:
>> 1. movie decoding in HD-quality (only one instance)
>> 2. two instances of SD-quality movie decoding and SD-quality move encoding
>>    (example: video conference)
>>
>> We know that these two use cases are exclusive, so they can use the same
>> reserved memory.

> When I said 'allocating X buffers of Y MB memory' I meant that you need to
> allocate a known amount of memory (X * Y MB in this case). So the boot args
> say e.g. dma=40MB and the driver just allocates X buffers from that region.

But the point is that driver does not allocate memory at boot time.  If video
codec would allocate memory at boot time no one else could use it even if the
codec is not used.  The idea is to make other devices use the memory when
codec is idle.  For instance, one could work on huge JPEG images and need
buffers for a hardware JPEG codec.

Or have I misunderstood your point?

> Switching to SD quality requires releasing those buffers and instead allocating
> a number of smaller buffers from the same region.

Our intention is that driver would allocate buffers only when needed so the buffers
would be freed when video codec driver is released.  So when the device is opened
(whatever that would mean for a particular device) it would allocate enough memory
for the requested task.

> For these use-cases the allocator can be very simple and probably covers most
> use-cases.

Yes, this is our experience.  The best-fit algorithm, even though simple, seem to
handle use cases tested on our system with little fragmentation.

> Anyway, I'm no expert on memory allocators and people on the linux-mm list are
> no doubt much more qualified to discuss this. My main concern is that of
> trying to add too much for a first release. It is simply easier to start simple
> and extend as needed. That makes it easier to be accepted in the mainline.

I'm trying to keep it as simple as possible :) still making it useful for us.

In particular we need a way to specify where different regions reside (different
memory banks, etc.) as well as specify which drivers should use which regions.
What's more, we need the notion of a "kind" of memory as one driver may need
memory buffers from different regions (ie. MFC needs to allocate buffers from
both banks).

>>>>>> +    2. CMA allows a run-time configuration of the memory regions it
>>>>>> +       will use to allocate chunks of memory from.  The set of memory
>>>>>> +       regions is given on command line so it can be easily changed
>>>>>> +       without the need for recompiling the kernel.
>>>>>> +
>>>>>> +       Each region has it's own size, alignment demand, a start
>>>>>> +       address (physical address where it should be placed) and an
>>>>>> +       allocator algorithm assigned to the region.
>>>>>> +
>>>>>> +       This means that there can be different algorithms running at
>>>>>> +       the same time, if different devices on the platform have
>>>>>> +       distinct memory usage characteristics and different algorithm
>>>>>> +       match those the best way.

>>>>> Seems overengineering to me. Just ensure that the code can be extended
>>>>> later to such hypothetical scenarios. They are hypothetical, right?

1. Everyone seem to hate the command line interface that was present in the
    first and second version of the patch.  As such, I've made it optional
    (via Kconfig option) in the third version (not posted yet), which
    unfortunately makes platform initialisation code longer and more
    complicated but hopefully more people will be happy. ;)

2. We need to specify size, alignment and start address so those are not
    hypothetical.

3. The algorithms are somehow hypothetical (we haven't tried using a different
    allocator as of you) but I think it's much easier to design the whole system
    with them in mind and implement them in the first version then later add code
    for them.

>>>>>> +    4. For greater flexibility and extensibility, the framework allows
>>>>>> +       device drivers to register private regions of reserved memory
>>>>>> +       which then may be used only by them.
>>>>>> +
>>>>>> +       As an effect, if a driver would not use the rest of the CMA
>>>>>> +       interface, it can still use CMA allocators and other
>>>>>> +       mechanisms.

>>>>> Why would you? Is there an actual driver that will need this?

>>>> This feature has been added after posting v1 of this rfc/patch. Jonathan
>>>> Corbet suggested in
>>>> <http://article.gmane.org/gmane.linux.kernel.mm/50689>
>>>> that viafb driver might register its own private memory and use cma just
>>>> as an allocator.

I may also add that adding this actually made me refactor the code a bit
making it more readable in the end I think. :)

>>> What I have seen in practice is that these drivers just
>>> need X amount of contiguous memory on boot. Having just a single region (as
>>> it will be for most systems) to carve the buffers from is just as efficient
>>> if not more than creating separate regions for each driver. Only if you
>>> start freeing and reallocating memory later on will you get into trouble.
>>>
>>> But if you do that, then you are trying to duplicate the behavior of the
>>> normal allocators in my opinion. I really don't think we want to go there.

Please note that kmalloc() was not designed to handle big chunks of memory
and vmalloc() does not give a contiguous memory blocks.  This is usually
reason enough for a custom allocator that operates on a big region of memory
reserved at boot time.

For instance, if some driver operates on buffers that are between 512 KiB and 4 MiB
(as I've shown above such sizes could well be required for textures) it needs to
reserve some big region of contiguous memory and then manage it by itself.

One of CMA's goals is to give a common API for drivers that need such allocators.

>>>>>> +       4a. Early in boot process, device drivers can also request the
>>>>>> +           CMA framework to a reserve a region of memory for them
>>>>>> +           which then will be used as a private region.
>>>>>> +
>>>>>> +           This way, drivers do not need to directly call bootmem,
>>>>>> +           memblock or similar early allocator but merely register an
>>>>>> +           early region and the framework will handle the rest
>>>>>> +           including choosing the right early allocator.

>>>>> The whole concept of private regions seems unnecessary to me.

This particular thing was suggested by someone I think.  Or maybe someone wrote
something that make me think about it?  Someone suggested that drivers may want
to just grab some region of memory and have it for themselves.  Even though I'd
rather see them using the other set of CMA APIs but nonetheless it may prove
useful for someone.

This is especially true for devices with their own memory which only their
driver should have access to.  I admit that it is a bit hypothetical though.

At any rate, with a changes made between the first and the second (this one)
versions of the patch private regions were actually trivial to add.  This
merely mimics the way regions are reserved at boot time so the code is
simply identical to what platform initialisation code may use.  The only
thing that make private regions special is the fact that they have no name.

>>>>> It looks to me as if you tried to think of all possible hypothetical
>>>>> situations and write a framework for that.

Not exactly...  The first version of the patch provided fewer features and
this was mostly what we needed on our platform with maybe a few features
that weren't a must.

After posting we received some comments and suggestions which made my change
the code a bit making it more flexible and dynamic at the same time letting
more features in.

> Regarding regions and shared and per-driver buffers: I've been thinking about
> this a bit more and I have a proposal of my own.
>
> There are two different aspects to this: first there is the hardware aspect: if
> the hardware needs memory from specific memory banks or with specific requirements
> (e.g. DMAable), then those regions should be setup in the platform code. There you
> know the memory sizes/alignments/etc. since that is hw dependent. The other reason
> is that drivers need to be able to tell CMA that they need to allocate from such
> regions.  You can't have a driver refer to a region that is specified through
> kernel parameters, that would create a very ugly dependency.
>
> The other aspect is how to setup buffers. A global buffer is simply setup by
> assigning a size to the region: "banka=20MB". Unless specified otherwise any
> driver that needs memory from banka will use that global banka buffer.
>
> Alternatively, you can set aside memory from a region specifically for drivers:
> banka/foo=30MB. This allocated 30 MB from region banka specifically for driver foo.
>
> You can also share that with another driver:
>
> banka/foo,bar=30MB
>
> Now this 30 MB buffer is shared between drivers foo and bar.

Let me rephrase it to see if I got it correct:

You propose that platform will define what types of memory it has.  For instance
banka for a the first bank, bankb for the second memory bank, dma for DMA-able
memory, etc.  Those definitions would be merely informative and by themselves
they would not reserve any memory.

Later, it would be possible to specify regions of memory of those types.  For
instance:

   banka=20M; banka/foo,bar=30M

would register two regions in the memory type "banka" such that the first is 20 MiB
and used by all drivers expect for driver foo and bar which would use the second
region of 30 MiB?

> The nice thing about this is that the driver will still only refer to region
> banka as setup by the platform code.

So the driver would request a memory type "banka" and then get a chunk from one of
the abovementioned regions?

I somehow like the simplicity of that but I see some disadvantages:

1. Imagine a video decoder which for best performance should operate on some buffers
    from the first and some buffers from the second bank.  However, if the buffers are
    from the incorrect bank it will still work, only slower.  In such situations you
    cannot specify that when driver foo requests memory type "banka" then it should
    first try memory type "banka" but if allocation failed there try "bankb".

2. What if the device handled by the above driver were run on a platform with only
    one memory bank?  The driver would still refer to "banka" and "bankb" but there
    would be no such types in the system.

3. What if there were one driver, initially written for platform X which used names
    "banka" and "bankb", and another driver, initially written for platform Y which
    used names "bank1" and "bank2".  How would you make them work on a single platform
    with two memory banks?

4. This is hypothetical, but the "kind" defined by CMA could be used to specify
    characteristics that are not hardware dependent.  For instance some driver
    could use kind "bulk" for some big, several MiB buffers and "control" for
    small less then MiB buffers.  Regions for those kinds could be of the same
    type of memory but it could be valuable splitting those to two regions to
    minimise fragmentation.

> And in the more general case you can have two standard regions: dma and common.
> So drivers can rely on the presence of a dma region when allocating buffers.

I think that driver should not care about or know region names at all.

> What would make this even better is that CMA has the option to try and allocate
> additional memory on the fly if its memory pool becomes empty. E.g. if the dma
> pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).

As I've said somewhere above, I was thinking about something like it.

> This allows you to setup the dma and common regions with size 0. So allocating
> from the dma region would effectively be the same as doing a kmalloc. Unless
> the user sets up a dma area in the kernel parameters.
>
> Obviously this is probably impossible if you need memory from specific memory
> banks, so this is something that is not available for every region.
>
> The nice thing about this is that it is very flexible for end users. For example,
> most users of the ivtv driver wouldn't have to do anything since most of the time
> it is able to assign the necessary buffers. But some users have multiple ivtv-based
> capture boards in their PC, and then it can become harder to have ivtv obtain the
> needed buffers. In that case they can preallocate the buffers by setting
> dma/ivtv=500MB or something like that.
>
> That would be a really nice feature...

I think the main difference between your proposal and what is in CMA is that you
propose that the platform define types of memory and that the user later be able
to define regions of a given type of memory.  This means that drivers would have
to be aware of the names of the types and specify the type name with each
allocation.

The CMA solution, however, lets drivers define their own kinds of memory, and
later on platform initialisation code maps drivers and their kinds to regions.

Have I got it right?

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--


^ permalink raw reply	[flat|nested] 57+ messages in thread

* [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-08-02 15:51                 ` Michał Nazarewicz
  0 siblings, 0 replies; 57+ messages in thread
From: Michał Nazarewicz @ 2010-08-02 15:51 UTC (permalink / raw)
  To: linux-arm-kernel

> On Wednesday 28 July 2010 11:04:56 Marek Szyprowski wrote:
>> Let me introduce one more example. As you may know we have 3 video-processor
>> capture devices (Samsung FIMC) and a hardware codec (like Samsung MFC). FIMC
>> can capture video data from camera sensor and accelerate common video
>> processing tasks (like up/down scaling and color space conversion). Two FIMC
>> and MFC are required for things like HD video encoding or decoding with
>> online display/preview. This task requires huge video buffers that are
>> usually allocated and freed at the same time. The third FIMC can be used for
>> gfx acceleration (color space conversion and scaling are quite common tasks
>> in GUI). This latter task usually introduces a lot of memory fragmentation,
>> as gfx surfaces are usually quite small (definitely smaller than HD frames
>> or 8MPix picture from camera). It would be really wise to get that third
>> FIMC device to use memory buffer that will be shared with 3D accelerator
>> (which has quite similar usage scenarios and suffers from similar memory
>> fragmentation).

On Sun, 01 Aug 2010 15:26:13 +0200, Hans Verkuil <hverkuil@xs4all.nl> wrote:
> OK, I understand. And I assume both gfx and 3D acceleration need to use a
> specific region? If they can use any type of memory, then this might be more
> appropriate for kmalloc and friends.

I've been thinking about providing a "fake" region with a "fake" allocator which
would pass requests on to kmalloc() and friends in a generic way.  Such regions
could prove valuable for small allocations in things like a 3D accelerator.

But as you've said, it's better to provide something small first and later add to
it so I'm postponing implementation of this feature.

Note, however, that a 3D accelerator does not operate only on small chunks of
memory.  A 1024x1024 texture is 1 Mipx.  RGB makes it 3 MiB.  With mipmaps it's
4 MiB.  Even a 512x512 texture can reach 1 MiB this way.  It may be impossible to
allocate such chunks with just a kmalloc().
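
To make the arithmetic above concrete, here is a minimal sketch in C that sums
a full mipmap chain.  The function name is made up for illustration, and it
assumes a square, power-of-two texture with tightly packed pixels (3 bytes per
pixel for RGB):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes needed for a square texture plus its full mipmap chain.
 * Each level halves the side until it reaches 1x1.
 * Hypothetical helper, for illustration only.
 */
static size_t texture_bytes_with_mipmaps(size_t side, size_t bytes_per_pixel)
{
	size_t total = 0;

	assert(side > 0 && bytes_per_pixel > 0);
	for (;;) {
		total += side * side * bytes_per_pixel;
		if (side == 1)
			break;
		side /= 2;
	}
	return total;
}
```

For a 1024x1024 RGB texture this gives 4194303 bytes, one byte short of 4 MiB;
for 512x512 it gives 1048575 bytes, just under 1 MiB.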

>> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
>> want to just reserve XX MB memory and then dynamically allocate buffers from
>> it. This enables us to perform the following 2 tasks:
>> 1. movie decoding in HD-quality (only one instance)
>> 2. two instances of SD-quality movie decoding and SD-quality movie encoding
>>    (example: video conference)
>>
>> We know that these two use cases are exclusive, so they can use the same
>> reserved memory.

> When I said 'allocating X buffers of Y MB memory' I meant that you need to
> allocate a known amount of memory (X * Y MB in this case). So the boot args
> say e.g. dma=40MB and the driver just allocates X buffers from that region.

But the point is that the driver does not allocate memory at boot time.  If the
video codec allocated memory at boot time, no one else could use it even when the
codec is not in use.  The idea is to let other devices use the memory when the
codec is idle.  For instance, one could work on huge JPEG images and need
buffers for a hardware JPEG codec.

Or have I misunderstood your point?

> Switching to SD quality requires releasing those buffers and instead allocating
> a number of smaller buffers from the same region.

Our intention is that the driver would allocate buffers only when needed, and the
buffers would be freed when the video codec driver is released.  So when the device
is opened (whatever that would mean for a particular device) it would allocate
enough memory for the requested task.

> For these use-cases the allocator can be very simple and probably covers most
> use-cases.

Yes, this is our experience.  The best-fit algorithm, even though simple, seems to
handle the use cases tested on our system with little fragmentation.
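
For illustration only, a best-fit policy over a list of free holes could look
roughly like the sketch below.  This is not the allocator from the patch:
struct free_chunk, best_fit_alloc() and ALLOC_FAILED are hypothetical names,
and alignment, freeing and merging of holes are left out:

```c
#include <assert.h>
#include <stddef.h>

/* One free hole inside a reserved region (hypothetical layout). */
struct free_chunk {
	unsigned long start;	/* offset of the hole within the region */
	unsigned long size;	/* size of the hole in bytes */
};

#define ALLOC_FAILED (~0UL)

/*
 * Best fit: pick the smallest hole that is still large enough, carve the
 * request from its start and return the offset, or ALLOC_FAILED.
 */
static unsigned long best_fit_alloc(struct free_chunk *holes, size_t nholes,
				    unsigned long size)
{
	struct free_chunk *best = NULL;
	unsigned long start;
	size_t i;

	assert(holes != NULL && size > 0);
	for (i = 0; i < nholes; i++) {
		if (holes[i].size < size)
			continue;
		if (!best || holes[i].size < best->size)
			best = &holes[i];
	}
	if (!best)
		return ALLOC_FAILED;

	start = best->start;
	best->start += size;
	best->size -= size;
	return start;
}
```

Choosing the tightest hole keeps the large holes intact for later big
requests, which is one plausible reason such a simple policy fragments little
when buffers are allocated and freed in groups.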

> Anyway, I'm no expert on memory allocators and people on the linux-mm list are
> no doubt much more qualified to discuss this. My main concern is that of
> trying to add too much for a first release. It is simply easier to start simple
> and extend as needed. That makes it easier to be accepted in the mainline.

I'm trying to keep it as simple as possible :) while still making it useful for us.

In particular, we need a way to specify where different regions reside (different
memory banks, etc.) as well as to specify which drivers should use which regions.
What's more, we need the notion of a "kind" of memory, as one driver may need
memory buffers from different regions (i.e. MFC needs to allocate buffers from
both banks).

>>>>>> +    2. CMA allows a run-time configuration of the memory regions it
>>>>>> +       will use to allocate chunks of memory from.  The set of memory
>>>>>> +       regions is given on command line so it can be easily changed
>>>>>> +       without the need for recompiling the kernel.
>>>>>> +
>>>>>> +       Each region has it's own size, alignment demand, a start
>>>>>> +       address (physical address where it should be placed) and an
>>>>>> +       allocator algorithm assigned to the region.
>>>>>> +
>>>>>> +       This means that there can be different algorithms running at
>>>>>> +       the same time, if different devices on the platform have
>>>>>> +       distinct memory usage characteristics and different algorithm
>>>>>> +       match those the best way.

>>>>> Seems overengineering to me. Just ensure that the code can be extended
>>>>> later to such hypothetical scenarios. They are hypothetical, right?

1. Everyone seems to hate the command line interface that was present in the
    first and second versions of the patch.  As such, I've made it optional
    (via a Kconfig option) in the third version (not posted yet), which
    unfortunately makes platform initialisation code longer and more
    complicated, but hopefully more people will be happy. ;)

2. We need to specify size, alignment and start address so those are not
    hypothetical.

3. The algorithms are somewhat hypothetical (we haven't tried using a different
    allocator as of yet) but I think it's much easier to design the whole system
    with them in mind and implement them in the first version than to add code
    for them later.

>>>>>> +    4. For greater flexibility and extensibility, the framework allows
>>>>>> +       device drivers to register private regions of reserved memory
>>>>>> +       which then may be used only by them.
>>>>>> +
>>>>>> +       As an effect, if a driver would not use the rest of the CMA
>>>>>> +       interface, it can still use CMA allocators and other
>>>>>> +       mechanisms.

>>>>> Why would you? Is there an actual driver that will need this?

>>>> This feature has been added after posting v1 of this rfc/patch. Jonathan
>>>> Corbet suggested in
>>>> <http://article.gmane.org/gmane.linux.kernel.mm/50689>
>>>> that viafb driver might register its own private memory and use cma just
>>>> as an allocator.

I may also add that adding this made me refactor the code a bit, which I think
made it more readable in the end. :)

>>> What I have seen in practice is that these drivers just
>>> need X amount of contiguous memory on boot. Having just a single region (as
>>> it will be for most systems) to carve the buffers from is just as efficient
>>> if not more than creating separate regions for each driver. Only if you
>>> start freeing and reallocating memory later on will you get into trouble.
>>>
>>> But if you do that, then you are trying to duplicate the behavior of the
>>> normal allocators in my opinion. I really don't think we want to go there.

Please note that kmalloc() was not designed to handle big chunks of memory
and vmalloc() does not give physically contiguous memory blocks.  This is usually
reason enough for a custom allocator that operates on a big region of memory
reserved at boot time.

For instance, if some driver operates on buffers that are between 512 KiB and 4 MiB
(as I've shown above such sizes could well be required for textures) it needs to
reserve some big region of contiguous memory and then manage it by itself.

One of CMA's goals is to give a common API for drivers that need such allocators.

>>>>>> +       4a. Early in boot process, device drivers can also request the
>>>>>> +           CMA framework to a reserve a region of memory for them
>>>>>> +           which then will be used as a private region.
>>>>>> +
>>>>>> +           This way, drivers do not need to directly call bootmem,
>>>>>> +           memblock or similar early allocator but merely register an
>>>>>> +           early region and the framework will handle the rest
>>>>>> +           including choosing the right early allocator.

>>>>> The whole concept of private regions seems unnecessary to me.

This particular thing was suggested by someone, I think.  Or maybe someone wrote
something that made me think about it?  Someone suggested that drivers may want
to just grab some region of memory and have it for themselves.  I'd rather see
them using the other set of CMA APIs, but nonetheless it may prove useful for
someone.

This is especially true for devices with their own memory which only their
driver should have access to.  I admit that it is a bit hypothetical though.

At any rate, with the changes made between the first and the second (this one)
versions of the patch, private regions were actually trivial to add.  This
merely mimics the way regions are reserved at boot time so the code is
simply identical to what platform initialisation code may use.  The only
thing that makes private regions special is the fact that they have no name.

>>>>> It looks to me as if you tried to think of all possible hypothetical
>>>>> situations and write a framework for that.

Not exactly...  The first version of the patch provided fewer features and
this was mostly what we needed on our platform with maybe a few features
that weren't a must.

After posting, we received some comments and suggestions which made me change
the code a bit, making it more flexible and dynamic while at the same time
letting more features in.

> Regarding regions and shared and per-driver buffers: I've been thinking about
> this a bit more and I have a proposal of my own.
>
> There are two different aspects to this: first there is the hardware aspect: if
> the hardware needs memory from specific memory banks or with specific requirements
> (e.g. DMAable), then those regions should be setup in the platform code. There you
> know the memory sizes/alignments/etc. since that is hw dependent. The other reason
> is that drivers need to be able to tell CMA that they need to allocate from such
> regions.  You can't have a driver refer to a region that is specified through
> kernel parameters, that would create a very ugly dependency.
>
> The other aspect is how to setup buffers. A global buffer is simply setup by
> assigning a size to the region: "banka=20MB". Unless specified otherwise any
> driver that needs memory from banka will use that global banka buffer.
>
> Alternatively, you can set aside memory from a region specifically for drivers:
> banka/foo=30MB. This allocated 30 MB from region banka specifically for driver foo.
>
> You can also share that with another driver:
>
> banka/foo,bar=30MB
>
> Now this 30 MB buffer is shared between drivers foo and bar.

Let me rephrase it to see if I got it correct:

You propose that the platform will define what types of memory it has.  For
instance, banka for the first memory bank, bankb for the second memory bank, dma
for DMA-able memory, etc.  Those definitions would be merely informative and by
themselves they would not reserve any memory.

Later, it would be possible to specify regions of memory of those types.  For
instance:

   banka=20M; banka/foo,bar=30M

would register two regions in the memory type "banka" such that the first is 20 MiB
and used by all drivers except for drivers foo and bar, which would use the second
region of 30 MiB?
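
As a rough sketch of how one entry of such a string might be parsed: the
syntax is only the proposal being discussed here, not something CMA
implements, and struct region_spec and parse_region_spec() are hypothetical
names.  Only the "M" suffix is handled, and splitting the string on ";" is
left to the caller:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One parsed "type[/drv,drv...]=<n>M" entry (hypothetical layout). */
struct region_spec {
	char type[16];		/* memory type, e.g. "banka" */
	char drivers[32];	/* comma-separated drivers; "" = all others */
	unsigned long size;	/* size in bytes */
};

/* Parse a single entry; returns 0 on success, -1 if it is malformed. */
static int parse_region_spec(const char *s, struct region_spec *out)
{
	const char *eq, *slash;
	size_t type_len, drv_len = 0;
	char *end;

	assert(s != NULL && out != NULL);

	eq = strchr(s, '=');
	if (!eq)
		return -1;

	/* An optional "/drivers" part sits between the type and '='. */
	slash = memchr(s, '/', (size_t)(eq - s));
	type_len = slash ? (size_t)(slash - s) : (size_t)(eq - s);
	if (slash) {
		drv_len = (size_t)(eq - slash - 1);
		if (drv_len == 0)
			return -1;
	}
	if (type_len == 0 || type_len >= sizeof(out->type) ||
	    drv_len >= sizeof(out->drivers))
		return -1;

	memcpy(out->type, s, type_len);
	out->type[type_len] = '\0';
	if (drv_len)
		memcpy(out->drivers, slash + 1, drv_len);
	out->drivers[drv_len] = '\0';

	/* Size is a decimal number followed by exactly "M" (MiB). */
	out->size = strtoul(eq + 1, &end, 10);
	if (end == eq + 1 || end[0] != 'M' || end[1] != '\0')
		return -1;
	out->size <<= 20;
	return 0;
}
```

An entry with an empty driver list describes the region shared by every
driver that is not named elsewhere.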

> The nice thing about this is that the driver will still only refer to region
> banka as setup by the platform code.

So the driver would request a memory type "banka" and then get a chunk from one of
the abovementioned regions?

I somehow like the simplicity of that but I see some disadvantages:

1. Imagine a video decoder which for best performance should operate on some buffers
    from the first and some buffers from the second bank.  However, if the buffers are
    from the incorrect bank it will still work, only slower.  In such situations you
    cannot specify that when driver foo requests memory type "banka" then it should
    first try memory type "banka" but if allocation failed there try "bankb".

2. What if the device handled by the above driver were run on a platform with only
    one memory bank?  The driver would still refer to "banka" and "bankb" but there
    would be no such types in the system.

3. What if there were one driver, initially written for platform X which used names
    "banka" and "bankb", and another driver, initially written for platform Y which
    used names "bank1" and "bank2".  How would you make them work on a single platform
    with two memory banks?

4. This is hypothetical, but the "kind" defined by CMA could be used to specify
    characteristics that are not hardware dependent.  For instance, some driver
    could use kind "bulk" for big buffers of several MiB and "control" for
    small buffers of less than a MiB.  Regions for those kinds could be of the
    same type of memory, but it could be valuable to split them into two regions
    to minimise fragmentation.

> And in the more general case you can have two standard regions: dma and common.
> So drivers can rely on the presence of a dma region when allocating buffers.

I think that a driver should not care about or know region names at all.

> What would make this even better is that CMA has the option to try and allocate
> additional memory on the fly if its memory pool becomes empty. E.g. if the dma
> pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).

As I've said somewhere above, I was thinking about something like it.

> This allows you to setup the dma and common regions with size 0. So allocating
> from the dma region would effectively be the same as doing a kmalloc. Unless
> the user sets up a dma area in the kernel parameters.
>
> Obviously this is probably impossible if you need memory from specific memory
> banks, so this is something that is not available for every region.
>
> The nice thing about this is that it is very flexible for end users. For example,
> most users of the ivtv driver wouldn't have to do anything since most of the time
> it is able to assign the necessary buffers. But some users have multiple ivtv-based
> capture boards in their PC, and then it can become harder to have ivtv obtain the
> needed buffers. In that case they can preallocate the buffers by setting
> dma/ivtv=500MB or something like that.
>
> That would be a really nice feature...

I think the main difference between your proposal and what is in CMA is that you
propose that the platform define types of memory and that the user later be able
to define regions of a given type of memory.  This means that drivers would have
to be aware of the names of the types and specify the type name with each
allocation.

The CMA solution, however, lets drivers define their own kinds of memory, and
later on platform initialisation code maps drivers and their kinds to regions.
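
As an illustration of the CMA side of that comparison, platform code could
hold a table that maps a driver and its kind to the regions to try.  This is a
hypothetical sketch, not the interface from the patch: struct kind_map,
lookup_regions() and any driver, kind and region names used with it are made
up:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One platform-provided mapping entry (hypothetical layout). */
struct kind_map {
	const char *driver;	/* driver the entry applies to */
	const char *kind;	/* driver-defined kind; NULL matches any kind */
	const char *regions;	/* regions to try, in order */
};

/* Return the regions of the first entry matching driver and kind, or NULL. */
static const char *lookup_regions(const struct kind_map *map, size_t n,
				  const char *driver, const char *kind)
{
	size_t i;

	assert(map != NULL && driver != NULL && kind != NULL);
	for (i = 0; i < n; i++) {
		if (strcmp(map[i].driver, driver) != 0)
			continue;
		if (map[i].kind == NULL || strcmp(map[i].kind, kind) == 0)
			return map[i].regions;
	}
	return NULL;
}
```

With entries such as { "mfc", "a", "bank1" } and { "mfc", "b", "bank2" }, the
driver only ever names its own kinds "a" and "b"; which bank each one lands in
is decided by the platform table.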

Have I got it right?

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-08-02 15:51                 ` Michał Nazarewicz
  (?)
@ 2010-08-03  7:19                   ` Hans Verkuil
  -1 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-08-03  7:19 UTC (permalink / raw)
  To: Michał Nazarewicz
  Cc: Marek Szyprowski, 'Daniel Walker',
	'Jonathan Corbet', Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', linux-mm, 'Kyungmin Park',
	'Zach Pfeffer',
	linux-arm-kernel, linux-media

On Monday 02 August 2010 17:51:39 Michał Nazarewicz wrote:

<snip>

> >> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
> >> want to just reserve XX MB memory and then dynamically allocate buffers from
> >> it. This enables us to perform the following 2 tasks:
> >> 1. movie decoding in HD-quality (only one instance)
> >> 2. two instances of SD-quality movie decoding and SD-quality move encoding
> >>    (example: video conference)
> >>
> >> We know that these two use cases are exclusive, so they can use the same
> >> reserved memory.
> 
> > When I said 'allocating X buffers of Y MB memory' I meant that you need to
> > allocate a known amount of memory (X * Y MB in this case). So the boot args
> > say e.g. dma=40MB and the driver just allocates X buffers from that region.
> 
> But the point is that driver does not allocate memory at boot time.  If video
> codec would allocate memory at boot time no one else could use it even if the
> codec is not used.  The idea is to make other devices use the memory when
> codec is idle.  For instance, one could work on huge JPEG images and need
> buffers for a hardware JPEG codec.
> 
> Or have I misunderstood your point?

I think we are talking about the same thing. A region like dma=40MB would be
shared by all drivers that want to allocate from it.
 
> > Switching to SD quality requires releasing those buffers and instead allocating
> > a number of smaller buffers from the same region.
> 
> Our intention is that driver would allocate buffers only when needed so the buffers
> would be freed when video codec driver is released.  So when the device is opened
> (whatever that would mean for a particular device) it would allocate enough memory
> for the requested task.

Right.

> > For these use-cases the allocator can be very simple and probably covers most
> > use-cases.
> 
> Yes, this is our experience.  The best-fit algorithm, even though simple, seem to
> handle use cases tested on our system with little fragmentation.

That's what I expected as well.

> > Anyway, I'm no expert on memory allocators and people on the linux-mm list are
> > no doubt much more qualified to discuss this. My main concern is that of
> > trying to add too much for a first release. It is simply easier to start simple
> > and extend as needed. That makes it easier to be accepted in the mainline.
> 
> I'm trying to keep it as simple as possible :) still making it useful for us.
> 
> In particular we need a way to specify where different regions reside (different
> memory banks, etc.) as well as specify which drivers should use which regions.
> What's more, we need the notion of a "kind" of memory as one driver may need
> memory buffers from different regions (ie. MFC needs to allocate buffers from
> both banks).
> 
> >>>>>> +    2. CMA allows a run-time configuration of the memory regions it
> >>>>>> +       will use to allocate chunks of memory from.  The set of memory
> >>>>>> +       regions is given on command line so it can be easily changed
> >>>>>> +       without the need for recompiling the kernel.
> >>>>>> +
> >>>>>> +       Each region has it's own size, alignment demand, a start
> >>>>>> +       address (physical address where it should be placed) and an
> >>>>>> +       allocator algorithm assigned to the region.
> >>>>>> +
> >>>>>> +       This means that there can be different algorithms running at
> >>>>>> +       the same time, if different devices on the platform have
> >>>>>> +       distinct memory usage characteristics and different algorithm
> >>>>>> +       match those the best way.
> 
> >>>>> Seems overengineering to me. Just ensure that the code can be extended
> >>>>> later to such hypothetical scenarios. They are hypothetical, right?
> 
> 1. Everyone seem to hate the command line interface that was present in the
>     first and second version of the patch.  As such, I've made it optional
>     (via Kconfig option) in the third version (not posted yet), which
>     unfortunately makes platform initialisation code longer and more
>     complicated but hopefully more people will be happy. ;)

I strongly recommend that it is simply dropped from the first version. That
will increase the chances of getting it merged. And once merged, interfaces
like this can be discussed at leisure.
 
> 2. We need to specify size, alignment and start address so those are not
>     hypothetical.

Agreed. But this is platform code, you should not have to pass this info
through boot args.
 
> 3. The algorithms are somehow hypothetical (we haven't tried using a different
>     allocator as of you) but I think it's much easier to design the whole system
>     with them in mind and implement them in the first version then later add code
>     for them.

I agree with that as long as the extra code needed to do so is within limits.
I've seen too often that people design for a future that never happens. That
leads to code that is never used and will make it hard for future generations
of developers to figure out what the purpose of that code was.

<snip>

> > Regarding regions and shared and per-driver buffers: I've been thinking about
> > this a bit more and I have a proposal of my own.
> >
> > There are two different aspects to this: first there is the hardware aspect: if
> > the hardware needs memory from specific memory banks or with specific requirements
> > (e.g. DMAable), then those regions should be setup in the platform code. There you
> > know the memory sizes/alignments/etc. since that is hw dependent. The other reason
> > is that drivers need to be able to tell CMA that they need to allocate from such
> > regions.  You can't have a driver refer to a region that is specified through
> > kernel parameters, that would create a very ugly dependency.
> >
> > The other aspect is how to setup buffers. A global buffer is simply setup by
> > assigning a size to the region: "banka=20MB". Unless specified otherwise any
> > driver that needs memory from banka will use that global banka buffer.
> >
> > Alternatively, you can set aside memory from a region specifically for drivers:
> > banka/foo=30MB. This allocated 30 MB from region banka specifically for driver foo.
> >
> > You can also share that with another driver:
> >
> > banka/foo,bar=30MB
> >
> > Now this 30 MB buffer is shared between drivers foo and bar.
> 
> Let me rephrase it to see if I got it correct:
> 
> You propose that platform will define what types of memory it has.  For instance
> banka for a the first bank, bankb for the second memory bank, dma for DMA-able
> memory, etc.  Those definitions would be merely informative and by themselves
> they would not reserve any memory.

Right. It might be an option that they do reserve a minimal amount of memory.
If you know that you always need at least X MB of memory to get the system
running, then that might be useful.
 
> Later, it would be possible to specify regions of memory of those types.  For
> instance:
> 
>    banka=20M; banka/foo,bar=30M
> 
> would register two regions in the memory type "banka" such that the first is 20 MiB
> and used by all drivers expect for driver foo and bar which would use the second
> region of 30 MiB?

Right.
 
> > The nice thing about this is that the driver will still only refer to region
> > banka as setup by the platform code.
> 
> So the driver would request a memory type "banka" and then get a chunk from one of
> the abovementioned regions?

Right.
 
> I somehow like the simplicity of that but I see some disadvantages:
> 
> 1. Imagine a video decoder which for best performance should operate on some buffers
>     from the first and some buffers from the second bank.  However, if the buffers are
>     from the incorrect bank it will still work, only slower.  In such situations you
>     cannot specify that when driver foo requests memory type "banka" then it should
>     first try memory type "banka" but if allocation failed there try "bankb".

Not quite sure I understand the problem here. Isn't that something for the driver to
decide? If it can only work with buffers from banka, then it will just fail if it
cannot allocate the required buffers. On the other hand, if it can also work with
buffers from bankb if banka is full, then it can just use bankb as fallback.

This type of behavior is very much driver specific and as such should be done by
the driver and not through user supplied kernel parameters IMHO.
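
A driver-side fallback of the kind described above might be sketched like
this.  The region_alloc_fn hook and toy_region_alloc() are stand-ins invented
for the example, not a CMA interface; the toy allocator simply pretends that
"banka" is always full:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allocation hook: a buffer from the named region, or NULL. */
typedef void *(*region_alloc_fn)(const char *region, size_t size);

/* Toy stand-in for a real allocator: "banka" is always full. */
static void *toy_region_alloc(const char *region, size_t size)
{
	static char bankb_pool[4096];

	if (strcmp(region, "bankb") == 0 && size <= sizeof(bankb_pool))
		return bankb_pool;
	return NULL;
}

/* Try each region in the driver's preferred order; report the one used. */
static void *alloc_with_fallback(region_alloc_fn alloc,
				 const char *const *regions, size_t nregions,
				 size_t size, const char **used)
{
	size_t i;

	assert(alloc != NULL && regions != NULL && used != NULL);
	for (i = 0; i < nregions; i++) {
		void *buf = alloc(regions[i], size);

		if (buf) {
			*used = regions[i];
			return buf;
		}
	}
	*used = NULL;
	return NULL;
}
```

A driver that can only work with banka would pass a single-entry list and
fail when that allocation fails; one that tolerates the slower bank would
pass { "banka", "bankb" }.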
 
> 2. What if the device handled by the above driver were run on a platform with only
>     one memory bank?  The driver would still refer to "banka" and "bankb" but there
>     would be no such types in the system.

First of all, any driver that needs specific memory banks is highly platform
specific and is extremely unlikely to work anywhere else.

But this can also be handled in the driver itself. Either through config #ifdefs or
by using e.g. a dma region as fallback.

> 3. What if there were one driver, initially written for platform X which used names
>     "banka" and "bankb", and another driver, initially written for platform Y which
>     used names "bank1" and "bank2".  How would you make them work on a single platform
>     with two memory banks?

Sorry, I don't understand the question. I think I would refer to my answer to the
previous question, but I'm not sure if that covers this.
 
> 4. This is hypothetical, but the "kind" defined by CMA could be used to specify
>     characteristics that are not hardware dependent.  For instance some driver
>     could use kind "bulk" for some big, several MiB buffers and "control" for
>     small less then MiB buffers.  Regions for those kinds could be of the same
>     type of memory but it could be valuable splitting those to two regions to
>     minimise fragmentation.

That's actually a good point. I can imagine this.

I would not implement this for a first version. But one way this could be done is
by something like this:

dma/foo(kind)=20MB where '(kind)' is optional. The big problem I have with this
is that this means that you need to know what 'kinds' of memory a particular
driver needs.

This can always be added later. For an initial release I wouldn't do this.
 
> > And in the more general case you can have two standard regions: dma and common.
> > So drivers can rely on the presence of a dma region when allocating buffers.
> 
> I think that driver should not care about or know region names at all.

A region is very similar to the last argument to kmalloc. And drivers most definitely
need to know about regions, just like they need to specify the correct GFP flags.

In fact, it's the only thing they need to know.
 
> > What would make this even better is that CMA has the option to try and allocate
> > additional memory on the fly if its memory pool becomes empty. E.g. if the dma
> > pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).
> 
> As I've said somewhere above, I was thinking about something like it.

Cool.
 
> > This allows you to setup the dma and common regions with size 0. So allocating
> > from the dma region would effectively be the same as doing a kmalloc. Unless
> > the user sets up a dma area in the kernel parameters.
> >
> > Obviously this is probably impossible if you need memory from specific memory
> > banks, so this is something that is not available for every region.
> >
> > The nice thing about this is that it is very flexible for end users. For example,
> > most users of the ivtv driver wouldn't have to do anything since most of the time
> > it is able to assign the necessary buffers. But some users have multiple ivtv-based
> > capture boards in their PC, and then it can become harder to have ivtv obtain the
> > needed buffers. In that case they can preallocate the buffers by setting
> > dma/ivtv=500MB or something like that.
> >
> > That would be a really nice feature...
> 
> I think the main difference between your proposal and what is in CMA is that you
> propose that platform define types of memory and later on user will be able to
> define regions of given type of memory.  This means that drivers would have to
> be aware of the names of the types and specify the type name witch each allocation.
> 
> The CMA solution however, lets drivers define their own kinds of memory and later
> on platform initialisation code map drivers with their kinds to regions.
> 
> Have I got it right?

I think so, yes. The disadvantage of the CMA solution is that if you have a number
of drivers, each with their own kinds of memory, you get very complex mappings. And
remember that these drivers are not limited to the hardware inside the SoC, but can
also include e.g. USB drivers. You can't predict what USB device the end user will
connect to the device, so you would have to be able to handle any mapping that any
USB driver might need.

I really think this is the wrong approach.

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
@ 2010-08-03  7:19                   ` Hans Verkuil
  0 siblings, 0 replies; 57+ messages in thread
From: Hans Verkuil @ 2010-08-03  7:19 UTC (permalink / raw)
  To: Michał Nazarewicz
  Cc: Marek Szyprowski, 'Daniel Walker',
	'Jonathan Corbet', Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', linux-mm, 'Kyungmin Park',
	'Zach Pfeffer',
	linux-arm-kernel, linux-media

On Monday 02 August 2010 17:51:39 Michał Nazarewicz wrote:

<snip>

> >> We don't want to allocate X buffers of Y MB memory each on boot. Instead we
> >> want to just reserve XX MB memory and then dynamically allocate buffers from
> >> it. This enables us to perform the following 2 tasks:
> >> 1. movie decoding in HD-quality (only one instance)
> >> 2. two instances of SD-quality movie decoding and SD-quality move encoding
> >>    (example: video conference)
> >>
> >> We know that these two use cases are exclusive, so they can use the same
> >> reserved memory.
> 
> > When I said 'allocating X buffers of Y MB memory' I meant that you need to
> > allocate a known amount of memory (X * Y MB in this case). So the boot args
> > say e.g. dma=40MB and the driver just allocates X buffers from that region.
> 
> But the point is that driver does not allocate memory at boot time.  If video
> codec would allocate memory at boot time no one else could use it even if the
> codec is not used.  The idea is to make other devices use the memory when
> codec is idle.  For instance, one could work on huge JPEG images and need
> buffers for a hardware JPEG codec.
> 
> Or have I misunderstood your point?

I think we are talking about the same thing. A region like dma=40MB would be
shared by all drivers that want to allocate from it.
 
> > Switching to SD quality requires releasing those buffers and instead allocating
> > a number of smaller buffers from the same region.
> 
> Our intention is that driver would allocate buffers only when needed so the buffers
> would be freed when video codec driver is released.  So when the device is opened
> (whatever that would mean for a particular device) it would allocate enough memory
> for the requested task.

Right.

> > For these use-cases the allocator can be very simple and probably covers most
> > use-cases.
> 
> Yes, this is our experience.  The best-fit algorithm, even though simple, seems to
> handle the use cases tested on our system with little fragmentation.

That's what I expected as well.
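
For reference, best-fit over a list of free ranges can be sketched in a few
lines of userspace C. This is only an illustration under stated assumptions:
struct free_range and best_fit_alloc() are hypothetical names, this is not the
allocator from the patchset, and alignment and merging of freed chunks are
ignored.

```c
#include <stddef.h>

/* Hypothetical free range inside a reserved region (not a CMA structure). */
struct free_range {
	size_t start;
	size_t size;
};

/*
 * Pick the smallest free range that can hold `size` bytes, carve the
 * chunk from its start and return the chunk's start offset.
 * Returns (size_t)-1 if the request is empty or no range is large enough.
 */
static size_t best_fit_alloc(struct free_range *ranges, size_t n, size_t size)
{
	size_t best = n;	/* n means "nothing found yet" */
	size_t i, start;

	if (size == 0)
		return (size_t)-1;

	for (i = 0; i < n; i++) {
		if (ranges[i].size < size)
			continue;
		if (best == n || ranges[i].size < ranges[best].size)
			best = i;
	}
	if (best == n)
		return (size_t)-1;

	start = ranges[best].start;
	ranges[best].start += size;
	ranges[best].size -= size;
	return start;
}
```

Choosing the tightest range keeps the large ranges intact for large requests,
which is presumably why it behaves well when a few big video buffers dominate.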

> > Anyway, I'm no expert on memory allocators and people on the linux-mm list are
> > no doubt much more qualified to discuss this. My main concern is that of
> > trying to add too much for a first release. It is simply easier to start simple
> > and extend as needed. That makes it easier to be accepted in the mainline.
> 
> I'm trying to keep it as simple as possible :) still making it useful for us.
> 
> In particular we need a way to specify where different regions reside (different
> memory banks, etc.) as well as specify which drivers should use which regions.
> What's more, we need the notion of a "kind" of memory as one driver may need
> memory buffers from different regions (i.e. MFC needs to allocate buffers from
> both banks).
> 
> >>>>>> +    2. CMA allows a run-time configuration of the memory regions it
> >>>>>> +       will use to allocate chunks of memory from.  The set of memory
> >>>>>> +       regions is given on command line so it can be easily changed
> >>>>>> +       without the need for recompiling the kernel.
> >>>>>> +
> >>>>>> +       Each region has its own size, alignment demand, a start
> >>>>>> +       address (physical address where it should be placed) and an
> >>>>>> +       allocator algorithm assigned to the region.
> >>>>>> +
> >>>>>> +       This means that there can be different algorithms running at
> >>>>>> +       the same time, if different devices on the platform have
> >>>>>> +       distinct memory usage characteristics and different algorithms
> >>>>>> +       match those best.
> 
> >>>>> Seems overengineering to me. Just ensure that the code can be extended
> >>>>> later to such hypothetical scenarios. They are hypothetical, right?
> 
> 1. Everyone seems to hate the command line interface that was present in the
>     first and second version of the patch.  As such, I've made it optional
>     (via Kconfig option) in the third version (not posted yet), which
>     unfortunately makes platform initialisation code longer and more
>     complicated but hopefully more people will be happy. ;)

I strongly recommend that it is simply dropped from the first version. That
will increase the chances of getting it merged. And once merged, interfaces
like this can be discussed at leisure.
 
> 2. We need to specify size, alignment and start address so those are not
>     hypothetical.

Agreed. But this is platform code, you should not have to pass this info
through boot args.
 
> 3. The algorithms are somewhat hypothetical (we haven't tried using a different
>     allocator as of yet) but I think it's much easier to design the whole system
>     with them in mind and implement them in the first version than to add code
>     for them later.

I agree with that as long as the extra code needed to do so is within limits.
I've seen too often that people design for a future that never happens. That
leads to code that is never used and makes it hard for future generations
of developers to figure out what the purpose of that code was.

<snip>

> > Regarding regions and shared and per-driver buffers: I've been thinking about
> > this a bit more and I have a proposal of my own.
> >
> > There are two different aspects to this: first there is the hardware aspect: if
> > the hardware needs memory from specific memory banks or with specific requirements
> > (e.g. DMAable), then those regions should be set up in the platform code. There you
> > know the memory sizes/alignments/etc. since that is hw dependent. The other reason
> > is that drivers need to be able to tell CMA that they need to allocate from such
> > regions.  You can't have a driver refer to a region that is specified through
> > kernel parameters, that would create a very ugly dependency.
> >
> > The other aspect is how to set up buffers. A global buffer is simply set up by
> > assigning a size to the region: "banka=20MB". Unless specified otherwise any
> > driver that needs memory from banka will use that global banka buffer.
> >
> > Alternatively, you can set aside memory from a region specifically for drivers:
> > banka/foo=30MB. This allocates 30 MB from region banka specifically for driver foo.
> >
> > You can also share that with another driver:
> >
> > banka/foo,bar=30MB
> >
> > Now this 30 MB buffer is shared between drivers foo and bar.
> 
> Let me rephrase it to see if I got it correct:
> 
> You propose that platform will define what types of memory it has.  For instance
> banka for the first bank, bankb for the second memory bank, dma for DMA-able
> memory, etc.  Those definitions would be merely informative and by themselves
> they would not reserve any memory.

Right. It might be an option that they do reserve a minimal amount of memory.
If you know that you always need at least X MB of memory to get the system
running, then that might be useful.
 
> Later, it would be possible to specify regions of memory of those types.  For
> instance:
> 
>    banka=20M; banka/foo,bar=30M
> 
> would register two regions in the memory type "banka" such that the first is 20 MiB
> and used by all drivers except for drivers foo and bar which would use the second
> region of 30 MiB?

Right.
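
To illustrate the lookup this implies, here is a minimal userspace sketch. The
struct region layout and the list_has()/find_region() helpers are assumptions
made for illustration only and are not part of CMA: a driver asks for a memory
type, gets the region of that type that names it, and otherwise gets the
global region of that type.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one region; the names are illustrative only. */
struct region {
	const char *type;	/* memory type defined by the platform, e.g. "banka" */
	const char *drivers;	/* comma-separated driver names, NULL for the global region */
	unsigned long size_mb;
};

/* Return 1 if `name` is one of the comma-separated entries in `list`. */
static int list_has(const char *list, const char *name)
{
	size_t len = strlen(name);

	while (list && *list) {
		const char *end = strchr(list, ',');
		size_t n = end ? (size_t)(end - list) : strlen(list);

		if (n == len && strncmp(list, name, n) == 0)
			return 1;
		list = end ? end + 1 : NULL;
	}
	return 0;
}

/*
 * Find the region of `type` for `driver`: a region that lists the driver
 * wins, otherwise the global region of that type (or NULL if none exists).
 */
static const struct region *find_region(const struct region *r, size_t n,
					const char *type, const char *driver)
{
	const struct region *global = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(r[i].type, type) != 0)
			continue;
		if (!r[i].drivers)
			global = &r[i];
		else if (list_has(r[i].drivers, driver))
			return &r[i];
	}
	return global;
}
```

With the two regions from "banka=20M; banka/foo,bar=30M", drivers foo and bar
would resolve to the 30 MiB region and every other driver to the 20 MiB one.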
 
> > The nice thing about this is that the driver will still only refer to region
> > banka as set up by the platform code.
> 
> So the driver would request a memory type "banka" and then get a chunk from one of
> the abovementioned regions?

Right.
 
> I somehow like the simplicity of that but I see some disadvantages:
> 
> 1. Imagine a video decoder which for best performance should operate on some buffers
>     from the first and some buffers from the second bank.  However, if the buffers are
>     from the incorrect bank it will still work, only slower.  In such situations you
>     cannot specify that when driver foo requests memory type "banka" then it should
>     first try memory type "banka" but if allocation failed there try "bankb".

Not quite sure I understand the problem here. Isn't that something for the driver to
decide? If it can only work with buffers from banka, then it will just fail if it
cannot allocate the required buffers. On the other hand, if it can also work with
buffers from bankb if banka is full, then it can just use bankb as fallback.

This type of behavior is very much driver specific and as such should be done by
the driver and not through user supplied kernel parameters IMHO.
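
A driver-side fallback of this kind could look roughly like the userspace
sketch below. Everything in it is hypothetical (struct type_pool, pool_take(),
alloc_with_fallback()); it only models how many bytes each memory type has
left and shows the driver, not the kernel command line, deciding the order in
which types are tried.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-type pool; it only tracks how many bytes are still free. */
struct type_pool {
	const char *type;
	size_t free_bytes;
};

/* Take `size` bytes from the pool of the given type; returns 0 on success. */
static int pool_take(struct type_pool *pools, size_t n,
		     const char *type, size_t size)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(pools[i].type, type) == 0 &&
		    pools[i].free_bytes >= size) {
			pools[i].free_bytes -= size;
			return 0;
		}
	}
	return -1;
}

/*
 * Driver-side policy: try the types in the driver's preferred order and
 * return the name of the type that satisfied the request, or NULL.
 */
static const char *alloc_with_fallback(struct type_pool *pools, size_t n,
				       const char *const *prefer,
				       size_t nprefer, size_t size)
{
	size_t i;

	for (i = 0; i < nprefer; i++)
		if (pool_take(pools, n, prefer[i], size) == 0)
			return prefer[i];
	return NULL;
}
```

A driver that can only work with banka would pass a one-entry preference list
and simply fail when that pool is exhausted.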
 
> 2. What if the device handled by the above driver were run on a platform with only
>     one memory bank?  The driver would still refer to "banka" and "bankb" but there
>     would be no such types in the system.

First of all, any driver that needs specific memory banks is highly platform
specific and is extremely unlikely to work anywhere else.

But this can also be handled in the driver itself. Either through config #ifdefs or
by using e.g. a dma region as fallback.

> 3. What if there were one driver, initially written for platform X which used names
>     "banka" and "bankb", and another driver, initially written for platform Y which
>     used names "bank1" and "bank2".  How would you make them work on a single platform
>     with two memory banks?

Sorry, I don't understand the question. I think I would refer to my answer to the
previous question, but I'm not sure if that covers this.
 
> 4. This is hypothetical, but the "kind" defined by CMA could be used to specify
>     characteristics that are not hardware dependent.  For instance some driver
>     could use kind "bulk" for some big, several MiB buffers and "control" for
>     small (less than a MiB) buffers.  Regions for those kinds could be of the same
>     type of memory but it could be valuable to split those into two regions to
>     minimise fragmentation.

That's actually a good point. I can imagine this.

I would not implement this for a first version. But one way this could be done is
by something like this:

dma/foo(kind)=20MB where '(kind)' is optional. The big problem I have with this
is that this means that you need to know what 'kinds' of memory a particular
driver needs.

This can always be added later. For an initial release I wouldn't do this.
 
> > And in the more general case you can have two standard regions: dma and common.
> > So drivers can rely on the presence of a dma region when allocating buffers.
> 
> I think that driver should not care about or know region names at all.

A region is very similar to the last argument to kmalloc. And drivers most definitely
need to know about regions, just like they need to specify the correct GFP flags.

In fact, it's the only thing they need to know.
 
> > What would make this even better is that CMA has the option to try and allocate
> > additional memory on the fly if its memory pool becomes empty. E.g. if the dma
> > pool is full, then it can try to do a kmalloc(..., GFP_KERNEL | __GFP_DMA).
> 
> As I've said somewhere above, I was thinking about something like it.

Cool.
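
As a rough userspace model of that fallback (an assumption-laden sketch, not
kernel code): struct mem_pool and pool_alloc() are hypothetical, and malloc()
stands in for kmalloc(..., GFP_KERNEL | __GFP_DMA). A pool of size 0 then
behaves like a plain allocation, while a bank-specific pool can refuse to fall
back.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool: a bump allocator over a preallocated buffer. */
struct mem_pool {
	unsigned char *base;
	size_t size;		/* 0 means no memory was reserved for this region */
	size_t used;		/* invariant: used <= size */
	int allow_fallback;	/* 0 for regions tied to a specific memory bank */
};

/*
 * Allocate from the pool if it has room; otherwise, if the region allows
 * it, fall back to the general-purpose allocator. *from_pool reports
 * which path was taken so the caller knows how to release the chunk.
 */
static void *pool_alloc(struct mem_pool *p, size_t size, int *from_pool)
{
	*from_pool = 0;
	if (size == 0)
		return NULL;

	if (size <= p->size - p->used) {
		void *chunk = p->base + p->used;

		p->used += size;
		*from_pool = 1;
		return chunk;
	}

	/* malloc() stands in for kmalloc(size, GFP_KERNEL | __GFP_DMA). */
	return p->allow_fallback ? malloc(size) : NULL;
}
```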
 
> > This allows you to set up the dma and common regions with size 0. So allocating
> > from the dma region would effectively be the same as doing a kmalloc. Unless
> > the user sets up a dma area in the kernel parameters.
> >
> > Obviously this is probably impossible if you need memory from specific memory
> > banks, so this is something that is not available for every region.
> >
> > The nice thing about this is that it is very flexible for end users. For example,
> > most users of the ivtv driver wouldn't have to do anything since most of the time
> > it is able to assign the necessary buffers. But some users have multiple ivtv-based
> > capture boards in their PC, and then it can become harder to have ivtv obtain the
> > needed buffers. In that case they can preallocate the buffers by setting
> > dma/ivtv=500MB or something like that.
> >
> > That would be a really nice feature...
> 
> I think the main difference between your proposal and what is in CMA is that you
> propose that platform define types of memory and later on user will be able to
> define regions of given type of memory.  This means that drivers would have to
> be aware of the names of the types and specify the type name with each allocation.
> 
> The CMA solution, however, lets drivers define their own kinds of memory and later
> on platform initialisation code map drivers with their kinds to regions.
> 
> Have I got it right?

I think so, yes. The disadvantage of the CMA solution is that if you have a number
of drivers, each with their own kinds of memory, you get very complex mappings. And
remember that these drivers are not limited to the hardware inside the SoC, but can
also include e.g. USB drivers. You can't predict what USB device the end user will
connect to the device, so you would have to be able to handle any mapping that any
USB driver might need.

I really think this is the wrong approach.

Regards,

	Hans

-- 
Hans Verkuil - video4linux developer - sponsored by TANDBERG, part of Cisco


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added
  2010-08-03  7:19                   ` Hans Verkuil
  (?)
@ 2010-08-06 13:31                     ` Michał Nazarewicz
  -1 siblings, 0 replies; 57+ messages in thread
From: Michał Nazarewicz @ 2010-08-06 13:31 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Marek Szyprowski, 'Daniel Walker',
	'Jonathan Corbet', Pawel Osciak, 'Mark Brown',
	linux-kernel, 'Hiremath Vaibhav',
	'FUJITA Tomonori', linux-mm, 'Kyungmin Park',
	'Zach Pfeffer',
	linux-arm-kernel, linux-media

Hello Hans,

I've just posted an updated patchset.  It changes the way regions are
reserved somewhat, so our discussion is not entirely applicable to the
new version, I think.

I preserved the original "map" there.  I came to the conclusion that
your approach is not that different from what I had in mind, but I
noticed that with your syntax it's impossible to specify the order
of regions to try, for instance that a driver should first try region
"foo" and then region "bar" and not the other way around.

I'm looking forward to hearing your comments on the newest version
of CMA.

-- 
Best regards,                                        _     _
| Humble Liege of Serenely Enlightened Majesty of  o' \,=./ `o
| Computer Science,  Michał "mina86" Nazarewicz       (o o)
+----[mina86*mina86.com]---[mina86*jabber.org]----ooO--(_)--Ooo--

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2010-08-06 13:31 UTC | newest]

Thread overview: 57+ messages
2010-07-26 14:11 [PATCHv2 0/4] The Contiguous Memory Allocator Michal Nazarewicz
2010-07-26 14:40 ` Michal Nazarewicz
2010-07-26 14:11 ` Michal Nazarewicz
2010-07-26 14:11 ` [PATCHv2 1/4] lib: rbtree: rb_root_init() function added Michal Nazarewicz
2010-07-26 14:40   ` Michal Nazarewicz
2010-07-26 14:11   ` Michal Nazarewicz
2010-07-26 14:11   ` [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added Michal Nazarewicz
2010-07-26 14:40     ` Michal Nazarewicz
2010-07-26 14:11     ` Michal Nazarewicz
2010-07-26 14:11     ` [PATCHv2 3/4] mm: cma: Test device and application added Michal Nazarewicz
2010-07-26 14:40       ` Michal Nazarewicz
2010-07-26 14:11       ` Michal Nazarewicz
2010-07-26 14:11       ` [PATCHv2 4/4] arm: Added CMA to Aquila and Goni Michal Nazarewicz
2010-07-26 14:40         ` Michal Nazarewicz
2010-07-26 14:11         ` Michal Nazarewicz
2010-07-26 20:28     ` [PATCHv2 2/4] mm: cma: Contiguous Memory Allocator added Hans Verkuil
2010-07-26 20:28       ` Hans Verkuil
2010-07-26 20:28       ` Hans Verkuil
2010-07-27  7:41       ` Marek Szyprowski
2010-07-27  7:41         ` Marek Szyprowski
2010-07-27  7:41         ` Marek Szyprowski
2010-07-27 16:27         ` Hans Verkuil
2010-07-27 16:27           ` Hans Verkuil
2010-07-27 16:27           ` Hans Verkuil
2010-07-28  9:04           ` Marek Szyprowski
2010-07-28  9:04             ` Marek Szyprowski
2010-07-28  9:04             ` Marek Szyprowski
2010-08-01 13:26             ` Hans Verkuil
2010-08-01 13:26               ` Hans Verkuil
2010-08-01 13:26               ` Hans Verkuil
2010-08-02 15:51               ` Michał Nazarewicz
2010-08-02 15:51                 ` Michał Nazarewicz
2010-08-02 15:51                 ` Michał Nazarewicz
2010-08-03  7:19                 ` Hans Verkuil
2010-08-03  7:19                   ` Hans Verkuil
2010-08-03  7:19                   ` Hans Verkuil
2010-08-06 13:31                   ` Michał Nazarewicz
2010-08-06 13:31                     ` Michał Nazarewicz
2010-08-06 13:31                     ` Michał Nazarewicz
2010-07-27 12:08     ` Russell King - ARM Linux
2010-07-27 12:08       ` Russell King - ARM Linux
2010-07-27 12:08       ` Russell King - ARM Linux
2010-07-27 12:45       ` Marek Szyprowski
2010-07-27 12:45         ` Marek Szyprowski
2010-07-27 12:45         ` Marek Szyprowski
2010-07-27 12:58         ` Jonathan Corbet
2010-07-27 12:58           ` Jonathan Corbet
2010-07-27 12:58           ` Jonathan Corbet
2010-07-27 13:46           ` Marek Szyprowski
2010-07-27 13:46             ` Marek Szyprowski
2010-07-27 13:46             ` Marek Szyprowski
2010-07-27 14:21           ` FUJITA Tomonori
2010-07-27 14:21             ` FUJITA Tomonori
2010-07-27 14:21             ` FUJITA Tomonori
2010-07-28  8:53       ` Michał Nazarewicz
2010-07-28  8:53         ` Michał Nazarewicz
2010-07-28  8:53         ` Michał Nazarewicz
