Linux-Devicetree Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v4 0/4] Chunk Heap Support on DMA-HEAP
@ 2021-01-21 17:54 Minchan Kim
  2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
                   ` (3 more replies)
  0 siblings, 4 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-21 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, david, mhocko, surenb, pullip.cho,
	joaodias, hridya, john.stultz, sumit.semwal, linux-media,
	devicetree, hch, robh+dt, linaro-mm-sig, Minchan Kim

This patchset introduces a new dma heap, "chunk-heap" that makes it
easy to perform the bulk allocation of high order pages.
It has been created to help optimize the 4K/8K HDR video playback
with secure DRM HW to protect contents on memory. The HW needs
physically contiguous memory chunks(e.g, 64K) up to several hundred
MB memory.

To make such high-order big bulk allocations work, chunk-heap uses
CMA area. To avoid CMA allocation long stall on blocking pages(e.g.,
page writeback and/or page locking), it uses failfast mode of the
CMA API(i.e., __GFP_NORETRY) so it will continue to find easy
migratable pages in different pageblocks without stalling. At last
resort, it will allow the blocking only if it couldn't find the
available memory in the end.

First two patches introduces the failfast mode as __GFP_NORETRY
in alloc_contig_range and the allow to use it from the CMA API.
Third patch introduces device tree syntax for chunk-heap to bind
the specific CMA area with chunk-heap.
Finally, last patch implements chunk-heap as dma-buf heap.

* since v3 - https://lore.kernel.org/linux-mm/20210113012143.1201105-1-minchan@kernel.org/
  * use prefix for chunk-name - John
  * fix yamllint error - Rob
  * add reviewed-by - Suren

* since v2 - https://lore.kernel.org/linux-mm/20201201175144.3996569-1-minchan@kernel.org/
  * introduce gfp_mask with __GFP_NORETRY on cma_alloc - Michal
  * do not expoert CMA APIs - Christoph
  * use compatible string for DT instead of dma-heap specific property - Hridya

* Since v1 - https://lore.kernel.org/linux-mm/20201117181935.3613581-1-minchan@kernel.org/
  * introduce alloc_contig_mode - David
  * use default CMA instead of device tree - John

Hyesoo Yu (2):
  dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable
  dma-buf: heaps: add chunk heap to dmabuf heaps

Minchan Kim (2):
  mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  mm: failfast mode with __GFP_NORETRY in alloc_contig_range

 .../reserved-memory/dma_heap_chunk.yaml       |  56 ++
 drivers/dma-buf/heaps/Kconfig                 |   8 +
 drivers/dma-buf/heaps/Makefile                |   1 +
 drivers/dma-buf/heaps/chunk_heap.c            | 492 ++++++++++++++++++
 drivers/dma-buf/heaps/cma_heap.c              |   2 +-
 drivers/s390/char/vmcp.c                      |   2 +-
 include/linux/cma.h                           |   2 +-
 kernel/dma/contiguous.c                       |   3 +-
 mm/cma.c                                      |  12 +-
 mm/cma_debug.c                                |   2 +-
 mm/hugetlb.c                                  |   6 +-
 mm/page_alloc.c                               |   8 +-
 mm/secretmem.c                                |   3 +-
 13 files changed, 581 insertions(+), 16 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
 create mode 100644 drivers/dma-buf/heaps/chunk_heap.c

-- 
2.30.0.296.g2bfb1c46d8-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-21 17:54 [PATCH v4 0/4] Chunk Heap Support on DMA-HEAP Minchan Kim
@ 2021-01-21 17:54 ` Minchan Kim
  2021-01-21 18:46   ` David Hildenbrand
                     ` (2 more replies)
  2021-01-21 17:55 ` [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range Minchan Kim
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-21 17:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, david, mhocko, surenb, pullip.cho,
	joaodias, hridya, john.stultz, sumit.semwal, linux-media,
	devicetree, hch, robh+dt, linaro-mm-sig, Minchan Kim

The upcoming patch will introduce __GFP_NORETRY semantic
in alloc_contig_range which is a failfast mode of the API.
Instead of adding a additional parameter for gfp, replace
no_warn with gfp flag.

To keep old behaviors, it follows the rule below.

  no_warn 			gfp_flags

  false         		GFP_KERNEL
  true          		GFP_KERNEL|__GFP_NOWARN
  gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/dma-buf/heaps/cma_heap.c |  2 +-
 drivers/s390/char/vmcp.c         |  2 +-
 include/linux/cma.h              |  2 +-
 kernel/dma/contiguous.c          |  3 ++-
 mm/cma.c                         | 12 ++++++------
 mm/cma_debug.c                   |  2 +-
 mm/hugetlb.c                     |  6 ++++--
 mm/secretmem.c                   |  3 ++-
 8 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 364fc2f3e499..0afc1907887a 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -298,7 +298,7 @@ static int cma_heap_allocate(struct dma_heap *heap,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, false);
+	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, GFP_KERNEL);
 	if (!cma_pages)
 		goto free_buffer;
 
diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
index 9e066281e2d0..78f9adf56456 100644
--- a/drivers/s390/char/vmcp.c
+++ b/drivers/s390/char/vmcp.c
@@ -70,7 +70,7 @@ static void vmcp_response_alloc(struct vmcp_session *session)
 	 * anymore the system won't work anyway.
 	 */
 	if (order > 2)
-		page = cma_alloc(vmcp_cma, nr_pages, 0, false);
+		page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
 	if (page) {
 		session->response = (char *)page_to_phys(page);
 		session->cma_alloc = 1;
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 217999c8a762..d6c02d08ddbc 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -45,7 +45,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					const char *name,
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-			      bool no_warn);
+			      gfp_t gfp_mask);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 3d63d91cba5c..552ed531c018 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -260,7 +260,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+	return cma_alloc(dev_get_cma_area(dev), count, align, GFP_KERNEL |
+			(no_warn ? __GFP_NOWARN : 0));
 }
 
 /**
diff --git a/mm/cma.c b/mm/cma.c
index 0ba69cd16aeb..d50627686fec 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
  * @cma:   Contiguous memory region for which the allocation is performed.
  * @count: Requested number of pages.
  * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @no_warn: Avoid printing message about failed allocation
+ * @gfp_mask: GFP mask to use during the cma allocation.
  *
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
  */
 struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-		       bool no_warn)
+		       gfp_t gfp_mask)
 {
 	unsigned long mask, offset;
 	unsigned long pfn = -1;
@@ -438,8 +438,8 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	if (!cma || !cma->count || !cma->bitmap)
 		return NULL;
 
-	pr_debug("%s(cma %p, count %zu, align %d)\n", __func__, (void *)cma,
-		 count, align);
+	pr_debug("%s(cma %p, count %zu, align %d gfp_mask 0x%x)\n", __func__,
+			(void *)cma, count, align, gfp_mask);
 
 	if (!count)
 		return NULL;
@@ -471,7 +471,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
-				     GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
+						gfp_mask);
 
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
@@ -500,7 +500,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 			page_kasan_tag_reset(page + i);
 	}
 
-	if (ret && !no_warn) {
+	if (ret && !(gfp_mask & __GFP_NOWARN)) {
 		pr_err("%s: alloc failed, req-size: %zu pages, ret: %d\n",
 			__func__, count, ret);
 		cma_debug_show_areas(cma);
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index d5bf8aa34fdc..00170c41cf81 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -137,7 +137,7 @@ static int cma_alloc_mem(struct cma *cma, int count)
 	if (!mem)
 		return -ENOMEM;
 
-	p = cma_alloc(cma, count, 0, false);
+	p = cma_alloc(cma, count, 0, GFP_KERNEL);
 	if (!p) {
 		kfree(mem);
 		return -ENOMEM;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a6bad1f686c5..4209a2ed1e1b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1266,7 +1266,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 
 		if (hugetlb_cma[nid]) {
 			page = cma_alloc(hugetlb_cma[nid], nr_pages,
-					huge_page_order(h), true);
+					huge_page_order(h),
+					GFP_KERNEL | __GFP_NOWARN);
 			if (page)
 				return page;
 		}
@@ -1277,7 +1278,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 					continue;
 
 				page = cma_alloc(hugetlb_cma[node], nr_pages,
-						huge_page_order(h), true);
+						huge_page_order(h),
+						GFP_KERNEL | __GFP_NOWARN);
 				if (page)
 					return page;
 			}
diff --git a/mm/secretmem.c b/mm/secretmem.c
index b8a32954ac68..585d55b9f9d8 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -86,7 +86,8 @@ static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp)
 	struct page *page;
 	int err;
 
-	page = cma_alloc(secretmem_cma, nr_pages, PMD_SIZE, gfp & __GFP_NOWARN);
+	page = cma_alloc(secretmem_cma, nr_pages, PMD_SIZE,
+				GFP_KERNEL | (gfp & __GFP_NOWARN));
 	if (!page)
 		return -ENOMEM;
 
-- 
2.30.0.296.g2bfb1c46d8-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-21 17:54 [PATCH v4 0/4] Chunk Heap Support on DMA-HEAP Minchan Kim
  2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
@ 2021-01-21 17:55 ` Minchan Kim
  2021-01-25 13:12   ` Michal Hocko
  2021-01-21 17:55 ` [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable Minchan Kim
  2021-01-21 17:55 ` [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
  3 siblings, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-21 17:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, david, mhocko, surenb, pullip.cho,
	joaodias, hridya, john.stultz, sumit.semwal, linux-media,
	devicetree, hch, robh+dt, linaro-mm-sig, Minchan Kim

Contiguous memory allocation can be stalled due to waiting
on page writeback and/or page lock which causes unpredictable
delay. It's a unavoidable cost for the requestor to get *big*
contiguous memory but it's expensive for *small* contiguous
memory(e.g., order-4) because caller could retry the request
in different range where would have easy migratable pages
without stalling.

This patch introduce __GFP_NORETRY as compaction gfp_mask in
alloc_contig_range so it will fail fast without blocking
when it encounters pages needed waiting.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b031a5ae0bd5..1cdc3ee0b22e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8491,12 +8491,16 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	unsigned int nr_reclaimed;
 	unsigned long pfn = start;
 	unsigned int tries = 0;
+	unsigned int max_tries = 5;
 	int ret = 0;
 	struct migration_target_control mtc = {
 		.nid = zone_to_nid(cc->zone),
 		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
 	};
 
+	if (cc->alloc_contig && cc->mode == MIGRATE_ASYNC)
+		max_tries = 1;
+
 	migrate_prep();
 
 	while (pfn < end || !list_empty(&cc->migratepages)) {
@@ -8513,7 +8517,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 				break;
 			}
 			tries = 0;
-		} else if (++tries == 5) {
+		} else if (++tries == max_tries) {
 			ret = ret < 0 ? ret : -EBUSY;
 			break;
 		}
@@ -8564,7 +8568,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 		.nr_migratepages = 0,
 		.order = -1,
 		.zone = page_zone(pfn_to_page(start)),
-		.mode = MIGRATE_SYNC,
+		.mode = gfp_mask & __GFP_NORETRY ? MIGRATE_ASYNC : MIGRATE_SYNC,
 		.ignore_skip_hint = true,
 		.no_set_skip_hint = true,
 		.gfp_mask = current_gfp_context(gfp_mask),
-- 
2.30.0.296.g2bfb1c46d8-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable
  2021-01-21 17:54 [PATCH v4 0/4] Chunk Heap Support on DMA-HEAP Minchan Kim
  2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
  2021-01-21 17:55 ` [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range Minchan Kim
@ 2021-01-21 17:55 ` Minchan Kim
  2021-01-26  7:07   ` John Stultz
  2021-02-05 22:55   ` Rob Herring
  2021-01-21 17:55 ` [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
  3 siblings, 2 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-21 17:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, david, mhocko, surenb, pullip.cho,
	joaodias, hridya, john.stultz, sumit.semwal, linux-media,
	devicetree, hch, robh+dt, linaro-mm-sig, Minchan Kim

From: Hyesoo Yu <hyesoo.yu@samsung.com>

Document devicetree binding for chunk cma heap on dma heap framework.

The DMA chunk heap supports the bulk allocation of higher order pages.

The chunk heap's allocator allocates from the CMA area. It is optimized
to perform bulk allocation of higher order pages in an efficient manner.
For this purpose, the heap needs an exclusive CMA area that will only be
used for allocation by the heap. This is the reason why we need to use
the DT to create and configure a reserved memory region for use by the
chunk CMA heap driver. Since all allocation from DMA-BUF heaps happen
from the user-space, there is no other appropriate device-driver that
we can use to register the chunk CMA heap and configure the reserved
memory region for its use.

Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
---
 .../reserved-memory/dma_heap_chunk.yaml       | 56 +++++++++++++++++++
 1 file changed, 56 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml

diff --git a/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
new file mode 100644
index 000000000000..00db0ae6af61
--- /dev/null
+++ b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/reserved-memory/dma_heap_chunk.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Device tree binding for chunk heap on DMA HEAP FRAMEWORK
+
+description: |
+  The DMA chunk heap is backed by the Contiguous Memory Allocator (CMA) and
+  supports bulk allocation of fixed size pages.
+
+maintainers:
+  - Hyesoo Yu <hyesoo.yu@samsung.com>
+  - John Stultz <john.stultz@linaro.org>
+  - Minchan Kim <minchan@kernel.org>
+  - Hridya Valsaraju<hridya@google.com>
+
+
+properties:
+  compatible:
+    enum:
+      - dma_heap,chunk
+
+  chunk-order:
+    description: |
+            order of pages that will get allocated from the chunk DMA heap.
+    maxItems: 1
+
+  size:
+    maxItems: 1
+
+  alignment:
+    maxItems: 1
+
+required:
+  - compatible
+  - size
+  - alignment
+  - chunk-order
+
+additionalProperties: false
+
+examples:
+  - |
+    reserved-memory {
+        #address-cells = <2>;
+        #size-cells = <1>;
+
+        chunk_memory: chunk_memory {
+            compatible = "dma_heap,chunk";
+            size = <0x3000000>;
+            alignment = <0x0 0x00010000>;
+            chunk-order = <4>;
+        };
+    };
-- 
2.30.0.296.g2bfb1c46d8-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-21 17:54 [PATCH v4 0/4] Chunk Heap Support on DMA-HEAP Minchan Kim
                   ` (2 preceding siblings ...)
  2021-01-21 17:55 ` [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable Minchan Kim
@ 2021-01-21 17:55 ` Minchan Kim
  2021-01-26  7:07   ` Christoph Hellwig
                     ` (2 more replies)
  3 siblings, 3 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-21 17:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, david, mhocko, surenb, pullip.cho,
	joaodias, hridya, john.stultz, sumit.semwal, linux-media,
	devicetree, hch, robh+dt, linaro-mm-sig, Minchan Kim

From: Hyesoo Yu <hyesoo.yu@samsung.com>

This patch supports chunk heap that allocates the buffers that
arranged into a list a fixed size chunks taken from CMA.

The chunk heap driver is bound directly to a reserved_memory
node by following Rob Herring's suggestion in [1].

[1] https://lore.kernel.org/lkml/20191025225009.50305-2-john.stultz@linaro.org/T/#m3dc63acd33fea269a584f43bb799a876f0b2b45d

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/dma-buf/heaps/Kconfig      |   8 +
 drivers/dma-buf/heaps/Makefile     |   1 +
 drivers/dma-buf/heaps/chunk_heap.c | 492 +++++++++++++++++++++++++++++
 3 files changed, 501 insertions(+)
 create mode 100644 drivers/dma-buf/heaps/chunk_heap.c

diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index a5eef06c4226..e9595e26f831 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -12,3 +12,11 @@ config DMABUF_HEAPS_CMA
 	  Choose this option to enable dma-buf CMA heap. This heap is backed
 	  by the Contiguous Memory Allocator (CMA). If your system has these
 	  regions, you should say Y here.
+
+config DMABUF_HEAPS_CHUNK
+	bool "DMA-BUF CHUNK Heap"
+	depends on DMABUF_HEAPS && DMA_CMA
+	help
+	  Choose this option to enable dma-buf CHUNK heap. This heap is backed
+	  by the Contiguous Memory Allocator (CMA) and allocates the buffers that
+	  are arranged into a list of fixed size chunks taken from CMA.
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032..8faa6cfdc0c5 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DMABUF_HEAPS_SYSTEM)	+= system_heap.o
 obj-$(CONFIG_DMABUF_HEAPS_CMA)		+= cma_heap.o
+obj-$(CONFIG_DMABUF_HEAPS_CHUNK)	+= chunk_heap.o
diff --git a/drivers/dma-buf/heaps/chunk_heap.c b/drivers/dma-buf/heaps/chunk_heap.c
new file mode 100644
index 000000000000..15df42acee4b
--- /dev/null
+++ b/drivers/dma-buf/heaps/chunk_heap.c
@@ -0,0 +1,492 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DMA-BUF chunk heap exporter
+ *
+ * Copyright (c) 2020 Samsung Electronics Co., Ltd.
+ * Author: <hyesoo.yu@samsung.com> for Samsung Electronics.
+ */
+
+#include <linux/cma.h>
+#include <linux/device.h>
+#include <linux/dma-buf.h>
+#include <linux/dma-heap.h>
+#include <linux/dma-mapping.h>
+#include <linux/dma-map-ops.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/highmem.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_fdt.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/scatterlist.h>
+#include <linux/sched/signal.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+
+struct chunk_heap {
+	struct dma_heap *heap;
+	uint32_t order;
+	struct cma *cma;
+};
+
+struct chunk_heap_buffer {
+	struct chunk_heap *heap;
+	struct list_head attachments;
+	struct mutex lock;
+	struct sg_table sg_table;
+	unsigned long len;
+	int vmap_cnt;
+	void *vaddr;
+};
+
+struct chunk_heap_attachment {
+	struct device *dev;
+	struct sg_table *table;
+	struct list_head list;
+	bool mapped;
+};
+
+struct chunk_heap chunk_heaps[MAX_CMA_AREAS];
+unsigned int chunk_heap_count;
+
+static struct sg_table *dup_sg_table(struct sg_table *table)
+{
+	struct sg_table *new_table;
+	int ret, i;
+	struct scatterlist *sg, *new_sg;
+
+	new_table = kzalloc(sizeof(*new_table), GFP_KERNEL);
+	if (!new_table)
+		return ERR_PTR(-ENOMEM);
+
+	ret = sg_alloc_table(new_table, table->orig_nents, GFP_KERNEL);
+	if (ret) {
+		kfree(new_table);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	new_sg = new_table->sgl;
+	for_each_sgtable_sg(table, sg, i) {
+		sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset);
+		new_sg = sg_next(new_sg);
+	}
+
+	return new_table;
+}
+
+static int chunk_heap_attach(struct dma_buf *dmabuf, struct dma_buf_attachment *attachment)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	struct chunk_heap_attachment *a;
+	struct sg_table *table;
+
+	a = kzalloc(sizeof(*a), GFP_KERNEL);
+	if (!a)
+		return -ENOMEM;
+
+	table = dup_sg_table(&buffer->sg_table);
+	if (IS_ERR(table)) {
+		kfree(a);
+		return -ENOMEM;
+	}
+
+	a->table = table;
+	a->dev = attachment->dev;
+
+	attachment->priv = a;
+
+	mutex_lock(&buffer->lock);
+	list_add(&a->list, &buffer->attachments);
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static void chunk_heap_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attachment)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	struct chunk_heap_attachment *a = attachment->priv;
+
+	mutex_lock(&buffer->lock);
+	list_del(&a->list);
+	mutex_unlock(&buffer->lock);
+
+	sg_free_table(a->table);
+	kfree(a->table);
+	kfree(a);
+}
+
+static struct sg_table *chunk_heap_map_dma_buf(struct dma_buf_attachment *attachment,
+					       enum dma_data_direction direction)
+{
+	struct chunk_heap_attachment *a = attachment->priv;
+	struct sg_table *table = a->table;
+	int ret;
+
+	if (a->mapped)
+		return table;
+
+	ret = dma_map_sgtable(attachment->dev, table, direction, 0);
+	if (ret)
+		return ERR_PTR(ret);
+
+	a->mapped = true;
+	return table;
+}
+
+static void chunk_heap_unmap_dma_buf(struct dma_buf_attachment *attachment,
+				     struct sg_table *table,
+				     enum dma_data_direction direction)
+{
+	struct chunk_heap_attachment *a = attachment->priv;
+
+	a->mapped = false;
+	dma_unmap_sgtable(attachment->dev, table, direction, 0);
+}
+
+static int chunk_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
+						enum dma_data_direction direction)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	struct chunk_heap_attachment *a;
+
+	mutex_lock(&buffer->lock);
+
+	if (buffer->vmap_cnt)
+		invalidate_kernel_vmap_range(buffer->vaddr, buffer->len);
+
+	list_for_each_entry(a, &buffer->attachments, list) {
+		if (!a->mapped)
+			continue;
+		dma_sync_sgtable_for_cpu(a->dev, a->table, direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static int chunk_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
+					      enum dma_data_direction direction)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	struct chunk_heap_attachment *a;
+
+	mutex_lock(&buffer->lock);
+
+	if (buffer->vmap_cnt)
+		flush_kernel_vmap_range(buffer->vaddr, buffer->len);
+
+	list_for_each_entry(a, &buffer->attachments, list) {
+		if (!a->mapped)
+			continue;
+		dma_sync_sgtable_for_device(a->dev, a->table, direction);
+	}
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static int chunk_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	struct sg_table *table = &buffer->sg_table;
+	unsigned long addr = vma->vm_start;
+	struct sg_page_iter piter;
+	int ret;
+
+	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
+		struct page *page = sg_page_iter_page(&piter);
+
+		ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
+				      vma->vm_page_prot);
+		if (ret)
+			return ret;
+		addr += PAGE_SIZE;
+		if (addr >= vma->vm_end)
+			return 0;
+	}
+	return 0;
+}
+
+static void *chunk_heap_do_vmap(struct chunk_heap_buffer *buffer)
+{
+	struct sg_table *table = &buffer->sg_table;
+	int npages = PAGE_ALIGN(buffer->len) / PAGE_SIZE;
+	struct page **pages = vmalloc(sizeof(struct page *) * npages);
+	struct page **tmp = pages;
+	struct sg_page_iter piter;
+	void *vaddr;
+
+	if (!pages)
+		return ERR_PTR(-ENOMEM);
+
+	for_each_sgtable_page(table, &piter, 0) {
+		WARN_ON(tmp - pages >= npages);
+		*tmp++ = sg_page_iter_page(&piter);
+	}
+
+	vaddr = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
+	vfree(pages);
+
+	if (!vaddr)
+		return ERR_PTR(-ENOMEM);
+
+	return vaddr;
+}
+
+static int chunk_heap_vmap(struct dma_buf *dmabuf, struct dma_buf_map *map)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	void *vaddr;
+
+	mutex_lock(&buffer->lock);
+	if (buffer->vmap_cnt) {
+		vaddr = buffer->vaddr;
+	} else {
+		vaddr = chunk_heap_do_vmap(buffer);
+		if (IS_ERR(vaddr)) {
+			mutex_unlock(&buffer->lock);
+
+			return PTR_ERR(vaddr);
+		}
+		buffer->vaddr = vaddr;
+	}
+	buffer->vmap_cnt++;
+	dma_buf_map_set_vaddr(map, vaddr);
+
+	mutex_unlock(&buffer->lock);
+
+	return 0;
+}
+
+static void chunk_heap_vunmap(struct dma_buf *dmabuf, struct dma_buf_map *map)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+
+	mutex_lock(&buffer->lock);
+	if (!--buffer->vmap_cnt) {
+		vunmap(buffer->vaddr);
+		buffer->vaddr = NULL;
+	}
+	mutex_unlock(&buffer->lock);
+}
+
+static void chunk_heap_dma_buf_release(struct dma_buf *dmabuf)
+{
+	struct chunk_heap_buffer *buffer = dmabuf->priv;
+	struct chunk_heap *chunk_heap = buffer->heap;
+	struct sg_table *table;
+	struct scatterlist *sg;
+	int i;
+
+	table = &buffer->sg_table;
+	for_each_sgtable_sg(table, sg, i)
+		cma_release(chunk_heap->cma, sg_page(sg), 1 << chunk_heap->order);
+	sg_free_table(table);
+	kfree(buffer);
+}
+
+static const struct dma_buf_ops chunk_heap_buf_ops = {
+	.attach = chunk_heap_attach,
+	.detach = chunk_heap_detach,
+	.map_dma_buf = chunk_heap_map_dma_buf,
+	.unmap_dma_buf = chunk_heap_unmap_dma_buf,
+	.begin_cpu_access = chunk_heap_dma_buf_begin_cpu_access,
+	.end_cpu_access = chunk_heap_dma_buf_end_cpu_access,
+	.mmap = chunk_heap_mmap,
+	.vmap = chunk_heap_vmap,
+	.vunmap = chunk_heap_vunmap,
+	.release = chunk_heap_dma_buf_release,
+};
+
+static int chunk_heap_allocate(struct dma_heap *heap, unsigned long len,
+			       unsigned long fd_flags, unsigned long heap_flags)
+{
+	struct chunk_heap *chunk_heap = dma_heap_get_drvdata(heap);
+	struct chunk_heap_buffer *buffer;
+	DEFINE_DMA_BUF_EXPORT_INFO(exp_info);
+	struct dma_buf *dmabuf;
+	struct sg_table *table;
+	struct scatterlist *sg;
+	struct page **pages;
+	unsigned int chunk_size = PAGE_SIZE << chunk_heap->order;
+	unsigned int count, alloced = 0;
+	unsigned int alloc_order = max_t(unsigned int, pageblock_order, chunk_heap->order);
+	unsigned int nr_chunks_per_alloc = 1 << (alloc_order - chunk_heap->order);
+	gfp_t gfp_flags = GFP_KERNEL|__GFP_NORETRY;
+	int ret = -ENOMEM;
+	pgoff_t pg;
+
+	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer)
+		return ret;
+
+	INIT_LIST_HEAD(&buffer->attachments);
+	mutex_init(&buffer->lock);
+	buffer->heap = chunk_heap;
+	buffer->len = ALIGN(len, chunk_size);
+	count = buffer->len / chunk_size;
+
+	pages = kvmalloc_array(count, sizeof(*pages), GFP_KERNEL);
+	if (!pages)
+		goto err_pages;
+
+	while (alloced < count) {
+		struct page *page;
+		int i;
+
+		while (count - alloced < nr_chunks_per_alloc) {
+			alloc_order--;
+			nr_chunks_per_alloc >>= 1;
+		}
+
+		page = cma_alloc(chunk_heap->cma, 1 << alloc_order,
+					alloc_order, gfp_flags);
+		if (!page) {
+			if (gfp_flags & __GFP_NORETRY) {
+				gfp_flags &= ~__GFP_NORETRY;
+				continue;
+			}
+			break;
+		}
+
+		for (i = 0; i < nr_chunks_per_alloc; i++, alloced++) {
+			pages[alloced] = page;
+			page += 1 << chunk_heap->order;
+		}
+	}
+
+	if (alloced < count)
+		goto err_alloc;
+
+	table = &buffer->sg_table;
+	if (sg_alloc_table(table, count, GFP_KERNEL))
+		goto err_alloc;
+
+	sg = table->sgl;
+	for (pg = 0; pg < count; pg++) {
+		sg_set_page(sg, pages[pg], chunk_size, 0);
+		sg = sg_next(sg);
+	}
+
+	exp_info.ops = &chunk_heap_buf_ops;
+	exp_info.size = buffer->len;
+	exp_info.flags = fd_flags;
+	exp_info.priv = buffer;
+	dmabuf = dma_buf_export(&exp_info);
+	if (IS_ERR(dmabuf)) {
+		ret = PTR_ERR(dmabuf);
+		goto err_export;
+	}
+	kvfree(pages);
+
+	ret = dma_buf_fd(dmabuf, fd_flags);
+	if (ret < 0) {
+		dma_buf_put(dmabuf);
+		return ret;
+	}
+
+	return 0;
+err_export:
+	sg_free_table(table);
+err_alloc:
+	for (pg = 0; pg < alloced; pg++)
+		cma_release(chunk_heap->cma, pages[pg], 1 << chunk_heap->order);
+	kvfree(pages);
+err_pages:
+	kfree(buffer);
+
+	return ret;
+}
+
+static const struct dma_heap_ops chunk_heap_ops = {
+	.allocate = chunk_heap_allocate,
+};
+
+#define CHUNK_PREFIX "chunk-"
+
+static int register_chunk_heap(struct chunk_heap *chunk_heap_info)
+{
+	struct dma_heap_export_info exp_info;
+	const char *name = cma_get_name(chunk_heap_info->cma);
+	size_t len = strlen(CHUNK_PREFIX) + strlen(name) + 1;
+	char *buf = kmalloc(len, GFP_KERNEL);
+
+	if (!buf)
+		return -ENOMEM;
+
+	sprintf(buf, CHUNK_PREFIX"%s", cma_get_name(chunk_heap_info->cma));
+	buf[len] = '\0';
+
+	exp_info.name = buf;
+	exp_info.name = cma_get_name(chunk_heap_info->cma);
+	exp_info.ops = &chunk_heap_ops;
+	exp_info.priv = chunk_heap_info;
+
+	chunk_heap_info->heap = dma_heap_add(&exp_info);
+	if (IS_ERR(chunk_heap_info->heap)) {
+		kfree(buf);
+		return PTR_ERR(chunk_heap_info->heap);
+	}
+
+	return 0;
+}
+
+static int __init chunk_heap_init(void)
+{
+	unsigned int i;
+
+	for (i = 0; i < chunk_heap_count; i++)
+		register_chunk_heap(&chunk_heaps[i]);
+
+	return 0;
+}
+module_init(chunk_heap_init);
+
+#ifdef CONFIG_OF_EARLY_FLATTREE
+
+static int __init dmabuf_chunk_heap_area_init(struct reserved_mem *rmem)
+{
+	int ret;
+	struct cma *cma;
+	struct chunk_heap *chunk_heap_info;
+	const __be32 *chunk_order;
+
+	phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
+	phys_addr_t mask = align - 1;
+
+	if ((rmem->base & mask) || (rmem->size & mask)) {
+		pr_err("Incorrect alignment for CMA region\n");
+		return -EINVAL;
+	}
+
+	ret = cma_init_reserved_mem(rmem->base, rmem->size, 0, rmem->name, &cma);
+	if (ret) {
+		pr_err("Reserved memory: unable to setup CMA region\n");
+		return ret;
+	}
+
+	/* Architecture specific contiguous memory fixup. */
+	dma_contiguous_early_fixup(rmem->base, rmem->size);
+
+	chunk_heap_info = &chunk_heaps[chunk_heap_count];
+	chunk_heap_info->cma = cma;
+
+	chunk_order = of_get_flat_dt_prop(rmem->fdt_node, "chunk-order", NULL);
+
+	if (chunk_order)
+		chunk_heap_info->order = be32_to_cpu(*chunk_order);
+	else
+		chunk_heap_info->order = 4;
+
+	chunk_heap_count++;
+
+	return 0;
+}
+RESERVEDMEM_OF_DECLARE(dmabuf_chunk_heap, "dma_heap,chunk",
+		       dmabuf_chunk_heap_area_init);
+#endif
+
+MODULE_DESCRIPTION("DMA-BUF Chunk Heap");
+MODULE_LICENSE("GPL v2");
-- 
2.30.0.296.g2bfb1c46d8-goog


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
@ 2021-01-21 18:46   ` David Hildenbrand
  2021-01-21 18:50   ` Minchan Kim
  2021-01-25 13:07   ` Michal Hocko
  2 siblings, 0 replies; 31+ messages in thread
From: David Hildenbrand @ 2021-01-21 18:46 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, mhocko, surenb, pullip.cho, joaodias,
	hridya, john.stultz, sumit.semwal, linux-media, devicetree, hch,
	robh+dt, linaro-mm-sig

On 21.01.21 18:54, Minchan Kim wrote:
> The upcoming patch will introduce __GFP_NORETRY semantic
> in alloc_contig_range which is a failfast mode of the API.
> Instead of adding a additional parameter for gfp, replace
> no_warn with gfp flag.
> 
> To keep old behaviors, it follows the rule below.
> 
>   no_warn 			gfp_flags
> 
>   false         		GFP_KERNEL
>   true          		GFP_KERNEL|__GFP_NOWARN
>   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  drivers/dma-buf/heaps/cma_heap.c |  2 +-
>  drivers/s390/char/vmcp.c         |  2 +-
>  include/linux/cma.h              |  2 +-
>  kernel/dma/contiguous.c          |  3 ++-
>  mm/cma.c                         | 12 ++++++------
>  mm/cma_debug.c                   |  2 +-
>  mm/hugetlb.c                     |  6 ++++--
>  mm/secretmem.c                   |  3 ++-
>  8 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
> index 364fc2f3e499..0afc1907887a 100644
> --- a/drivers/dma-buf/heaps/cma_heap.c
> +++ b/drivers/dma-buf/heaps/cma_heap.c
> @@ -298,7 +298,7 @@ static int cma_heap_allocate(struct dma_heap *heap,
>  	if (align > CONFIG_CMA_ALIGNMENT)
>  		align = CONFIG_CMA_ALIGNMENT;
>  
> -	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, false);
> +	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, GFP_KERNEL);
>  	if (!cma_pages)
>  		goto free_buffer;
>  
> diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
> index 9e066281e2d0..78f9adf56456 100644
> --- a/drivers/s390/char/vmcp.c
> +++ b/drivers/s390/char/vmcp.c
> @@ -70,7 +70,7 @@ static void vmcp_response_alloc(struct vmcp_session *session)
>  	 * anymore the system won't work anyway.
>  	 */
>  	if (order > 2)
> -		page = cma_alloc(vmcp_cma, nr_pages, 0, false);
> +		page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
>  	if (page) {
>  		session->response = (char *)page_to_phys(page);
>  		session->cma_alloc = 1;
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 217999c8a762..d6c02d08ddbc 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -45,7 +45,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
>  					const char *name,
>  					struct cma **res_cma);
>  extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> -			      bool no_warn);
> +			      gfp_t gfp_mask);
>  extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
>  
>  extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 3d63d91cba5c..552ed531c018 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -260,7 +260,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
>  	if (align > CONFIG_CMA_ALIGNMENT)
>  		align = CONFIG_CMA_ALIGNMENT;
>  
> -	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
> +	return cma_alloc(dev_get_cma_area(dev), count, align, GFP_KERNEL |
> +			(no_warn ? __GFP_NOWARN : 0));
>  }
>  
>  /**
> diff --git a/mm/cma.c b/mm/cma.c
> index 0ba69cd16aeb..d50627686fec 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
>   * @cma:   Contiguous memory region for which the allocation is performed.
>   * @count: Requested number of pages.
>   * @align: Requested alignment of pages (in PAGE_SIZE order).
> - * @no_warn: Avoid printing message about failed allocation
> + * @gfp_mask: GFP mask to use during the cma allocation.
>   *
>   * This function allocates part of contiguous memory on specific
>   * contiguous memory area.
>   */
>  struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
> -		       bool no_warn)
> +		       gfp_t gfp_mask)
>  {
>  	unsigned long mask, offset;
>  	unsigned long pfn = -1;
> @@ -438,8 +438,8 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
>  	if (!cma || !cma->count || !cma->bitmap)
>  		return NULL;
>  
> -	pr_debug("%s(cma %p, count %zu, align %d)\n", __func__, (void *)cma,
> -		 count, align);
> +	pr_debug("%s(cma %p, count %zu, align %d gfp_mask 0x%x)\n", __func__,
> +			(void *)cma, count, align, gfp_mask);
>  
>  	if (!count)
>  		return NULL;
> @@ -471,7 +471,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
>  
>  		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>  		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
> -				     GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
> +						gfp_mask);
>  
>  		if (ret == 0) {
>  			page = pfn_to_page(pfn);
> @@ -500,7 +500,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
>  			page_kasan_tag_reset(page + i);
>  	}
>  
> -	if (ret && !no_warn) {
> +	if (ret && !(gfp_mask & __GFP_NOWARN)) {
>  		pr_err("%s: alloc failed, req-size: %zu pages, ret: %d\n",
>  			__func__, count, ret);
>  		cma_debug_show_areas(cma);
> diff --git a/mm/cma_debug.c b/mm/cma_debug.c
> index d5bf8aa34fdc..00170c41cf81 100644
> --- a/mm/cma_debug.c
> +++ b/mm/cma_debug.c
> @@ -137,7 +137,7 @@ static int cma_alloc_mem(struct cma *cma, int count)
>  	if (!mem)
>  		return -ENOMEM;
>  
> -	p = cma_alloc(cma, count, 0, false);
> +	p = cma_alloc(cma, count, 0, GFP_KERNEL);
>  	if (!p) {
>  		kfree(mem);
>  		return -ENOMEM;
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a6bad1f686c5..4209a2ed1e1b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1266,7 +1266,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
>  
>  		if (hugetlb_cma[nid]) {
>  			page = cma_alloc(hugetlb_cma[nid], nr_pages,
> -					huge_page_order(h), true);
> +					huge_page_order(h),
> +					GFP_KERNEL | __GFP_NOWARN);
>  			if (page)
>  				return page;
>  		}
> @@ -1277,7 +1278,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
>  					continue;
>  
>  				page = cma_alloc(hugetlb_cma[node], nr_pages,
> -						huge_page_order(h), true);
> +						huge_page_order(h),
> +						GFP_KERNEL | __GFP_NOWARN);
>  				if (page)
>  					return page;
>  			}
> diff --git a/mm/secretmem.c b/mm/secretmem.c
> index b8a32954ac68..585d55b9f9d8 100644
> --- a/mm/secretmem.c
> +++ b/mm/secretmem.c
> @@ -86,7 +86,8 @@ static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp)
>  	struct page *page;
>  	int err;
>  
> -	page = cma_alloc(secretmem_cma, nr_pages, PMD_SIZE, gfp & __GFP_NOWARN);
> +	page = cma_alloc(secretmem_cma, nr_pages, PMD_SIZE,
> +				GFP_KERNEL | (gfp & __GFP_NOWARN));
>  	if (!page)
>  		return -ENOMEM;
>  
> 

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
  2021-01-21 18:46   ` David Hildenbrand
@ 2021-01-21 18:50   ` Minchan Kim
  2021-01-25 13:07   ` Michal Hocko
  2 siblings, 0 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-21 18:50 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, LKML, hyesoo.yu, david, mhocko, surenb, pullip.cho,
	joaodias, hridya, john.stultz, sumit.semwal, linux-media,
	devicetree, hch, robh+dt, linaro-mm-sig

On Thu, Jan 21, 2021 at 09:54:59AM -0800, Minchan Kim wrote:
> The upcoming patch will introduce __GFP_NORETRY semantic
> in alloc_contig_range which is a failfast mode of the API.
> Instead of adding a additional parameter for gfp, replace
> no_warn with gfp flag.
> 
> To keep old behaviors, it follows the rule below.
> 
>   no_warn 			gfp_flags
> 
>   false         		GFP_KERNEL
>   true          		GFP_KERNEL|__GFP_NOWARN
>   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Found one missing piece : cma_alloc_alinged

Resend with fixing


From 54f5de059636d2178bf5f716239a4a1ea9cbdc52 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Mon, 21 Dec 2020 17:55:37 -0800
Subject: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead

The upcoming patch will introduce __GFP_NORETRY semantic
in alloc_contig_range which is a failfast mode of the API.
Instead of adding a additional parameter for gfp, replace
no_warn with gfp flag.

To keep old behaviors, it follows the rule below.

  no_warn 			gfp_flags

  false         		GFP_KERNEL
  true          		GFP_KERNEL|__GFP_NOWARN
  gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 drivers/dma-buf/heaps/cma_heap.c |  2 +-
 drivers/s390/char/vmcp.c         |  2 +-
 include/linux/cma.h              |  2 +-
 kernel/dma/contiguous.c          |  6 ++++--
 mm/cma.c                         | 12 ++++++------
 mm/cma_debug.c                   |  2 +-
 mm/hugetlb.c                     |  6 ++++--
 mm/secretmem.c                   |  3 ++-
 8 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 364fc2f3e499..0afc1907887a 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -298,7 +298,7 @@ static int cma_heap_allocate(struct dma_heap *heap,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, false);
+	cma_pages = cma_alloc(cma_heap->cma, pagecount, align, GFP_KERNEL);
 	if (!cma_pages)
 		goto free_buffer;
 
diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
index 9e066281e2d0..78f9adf56456 100644
--- a/drivers/s390/char/vmcp.c
+++ b/drivers/s390/char/vmcp.c
@@ -70,7 +70,7 @@ static void vmcp_response_alloc(struct vmcp_session *session)
 	 * anymore the system won't work anyway.
 	 */
 	if (order > 2)
-		page = cma_alloc(vmcp_cma, nr_pages, 0, false);
+		page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
 	if (page) {
 		session->response = (char *)page_to_phys(page);
 		session->cma_alloc = 1;
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 217999c8a762..d6c02d08ddbc 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -45,7 +45,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					const char *name,
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-			      bool no_warn);
+			      gfp_t gfp_mask);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 3d63d91cba5c..f5a6fcaa9876 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -260,7 +260,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+	return cma_alloc(dev_get_cma_area(dev), count, align, GFP_KERNEL |
+			(no_warn ? __GFP_NOWARN : 0));
 }
 
 /**
@@ -283,7 +284,8 @@ static struct page *cma_alloc_aligned(struct cma *cma, size_t size, gfp_t gfp)
 {
 	unsigned int align = min(get_order(size), CONFIG_CMA_ALIGNMENT);
 
-	return cma_alloc(cma, size >> PAGE_SHIFT, align, gfp & __GFP_NOWARN);
+	return cma_alloc(cma, size >> PAGE_SHIFT, align,
+				GFP_KERNEL | (gfp & __GFP_NOWARN));
 }
 
 /**
diff --git a/mm/cma.c b/mm/cma.c
index 0ba69cd16aeb..d50627686fec 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
  * @cma:   Contiguous memory region for which the allocation is performed.
  * @count: Requested number of pages.
  * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @no_warn: Avoid printing message about failed allocation
+ * @gfp_mask: GFP mask to use during the cma allocation.
  *
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
  */
 struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-		       bool no_warn)
+		       gfp_t gfp_mask)
 {
 	unsigned long mask, offset;
 	unsigned long pfn = -1;
@@ -438,8 +438,8 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	if (!cma || !cma->count || !cma->bitmap)
 		return NULL;
 
-	pr_debug("%s(cma %p, count %zu, align %d)\n", __func__, (void *)cma,
-		 count, align);
+	pr_debug("%s(cma %p, count %zu, align %d gfp_mask 0x%x)\n", __func__,
+			(void *)cma, count, align, gfp_mask);
 
 	if (!count)
 		return NULL;
@@ -471,7 +471,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
-				     GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
+						gfp_mask);
 
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
@@ -500,7 +500,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 			page_kasan_tag_reset(page + i);
 	}
 
-	if (ret && !no_warn) {
+	if (ret && !(gfp_mask & __GFP_NOWARN)) {
 		pr_err("%s: alloc failed, req-size: %zu pages, ret: %d\n",
 			__func__, count, ret);
 		cma_debug_show_areas(cma);
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index d5bf8aa34fdc..00170c41cf81 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -137,7 +137,7 @@ static int cma_alloc_mem(struct cma *cma, int count)
 	if (!mem)
 		return -ENOMEM;
 
-	p = cma_alloc(cma, count, 0, false);
+	p = cma_alloc(cma, count, 0, GFP_KERNEL);
 	if (!p) {
 		kfree(mem);
 		return -ENOMEM;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a6bad1f686c5..4209a2ed1e1b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1266,7 +1266,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 
 		if (hugetlb_cma[nid]) {
 			page = cma_alloc(hugetlb_cma[nid], nr_pages,
-					huge_page_order(h), true);
+					huge_page_order(h),
+					GFP_KERNEL | __GFP_NOWARN);
 			if (page)
 				return page;
 		}
@@ -1277,7 +1278,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 					continue;
 
 				page = cma_alloc(hugetlb_cma[node], nr_pages,
-						huge_page_order(h), true);
+						huge_page_order(h),
+						GFP_KERNEL | __GFP_NOWARN);
 				if (page)
 					return page;
 			}
diff --git a/mm/secretmem.c b/mm/secretmem.c
index b8a32954ac68..585d55b9f9d8 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -86,7 +86,8 @@ static int secretmem_pool_increase(struct secretmem_ctx *ctx, gfp_t gfp)
 	struct page *page;
 	int err;
 
-	page = cma_alloc(secretmem_cma, nr_pages, PMD_SIZE, gfp & __GFP_NOWARN);
+	page = cma_alloc(secretmem_cma, nr_pages, PMD_SIZE,
+				GFP_KERNEL | (gfp & __GFP_NOWARN));
 	if (!page)
 		return -ENOMEM;
 
-- 
2.30.0.296.g2bfb1c46d8-goog



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
  2021-01-21 18:46   ` David Hildenbrand
  2021-01-21 18:50   ` Minchan Kim
@ 2021-01-25 13:07   ` Michal Hocko
  2021-01-25 19:42     ` Minchan Kim
  2 siblings, 1 reply; 31+ messages in thread
From: Michal Hocko @ 2021-01-25 13:07 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Thu 21-01-21 09:54:59, Minchan Kim wrote:
> The upcoming patch will introduce __GFP_NORETRY semantic
> in alloc_contig_range which is a failfast mode of the API.
> Instead of adding a additional parameter for gfp, replace
> no_warn with gfp flag.
> 
> To keep old behaviors, it follows the rule below.
> 
>   no_warn 			gfp_flags
> 
>   false         		GFP_KERNEL
>   true          		GFP_KERNEL|__GFP_NOWARN
>   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> 
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
[...]
> diff --git a/mm/cma.c b/mm/cma.c
> index 0ba69cd16aeb..d50627686fec 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
>   * @cma:   Contiguous memory region for which the allocation is performed.
>   * @count: Requested number of pages.
>   * @align: Requested alignment of pages (in PAGE_SIZE order).
> - * @no_warn: Avoid printing message about failed allocation
> + * @gfp_mask: GFP mask to use during the cma allocation.

Call out supported gfp flags explicitly. Have a look at kvmalloc_node
for a guidance.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-21 17:55 ` [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range Minchan Kim
@ 2021-01-25 13:12   ` Michal Hocko
  2021-01-25 13:13     ` Michal Hocko
  2021-01-25 19:33     ` Minchan Kim
  0 siblings, 2 replies; 31+ messages in thread
From: Michal Hocko @ 2021-01-25 13:12 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> Contiguous memory allocation can be stalled due to waiting
> on page writeback and/or page lock which causes unpredictable
> delay. It's a unavoidable cost for the requestor to get *big*
> contiguous memory but it's expensive for *small* contiguous
> memory(e.g., order-4) because caller could retry the request
> in different range where would have easy migratable pages
> without stalling.
> 
> This patch introduce __GFP_NORETRY as compaction gfp_mask in
> alloc_contig_range so it will fail fast without blocking
> when it encounters pages needed waiting.

I am not against controling how hard this allocator tries with gfp mask
but this changelog is rather void on any data and any user.

It is also rather dubious to have retries when then caller says to not
retry.

Also why didn't you consider GFP_NOWAIT semantic for non blocking mode?

> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_alloc.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b031a5ae0bd5..1cdc3ee0b22e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8491,12 +8491,16 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  	unsigned int nr_reclaimed;
>  	unsigned long pfn = start;
>  	unsigned int tries = 0;
> +	unsigned int max_tries = 5;
>  	int ret = 0;
>  	struct migration_target_control mtc = {
>  		.nid = zone_to_nid(cc->zone),
>  		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
>  	};
>  
> +	if (cc->alloc_contig && cc->mode == MIGRATE_ASYNC)
> +		max_tries = 1;
> +
>  	migrate_prep();
>  
>  	while (pfn < end || !list_empty(&cc->migratepages)) {
> @@ -8513,7 +8517,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  				break;
>  			}
>  			tries = 0;
> -		} else if (++tries == 5) {
> +		} else if (++tries == max_tries) {
>  			ret = ret < 0 ? ret : -EBUSY;
>  			break;
>  		}
> @@ -8564,7 +8568,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  		.nr_migratepages = 0,
>  		.order = -1,
>  		.zone = page_zone(pfn_to_page(start)),
> -		.mode = MIGRATE_SYNC,
> +		.mode = gfp_mask & __GFP_NORETRY ? MIGRATE_ASYNC : MIGRATE_SYNC,
>  		.ignore_skip_hint = true,
>  		.no_set_skip_hint = true,
>  		.gfp_mask = current_gfp_context(gfp_mask),
> -- 
> 2.30.0.296.g2bfb1c46d8-goog

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-25 13:12   ` Michal Hocko
@ 2021-01-25 13:13     ` Michal Hocko
  2021-01-25 19:33     ` Minchan Kim
  1 sibling, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2021-01-25 13:13 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Mon 25-01-21 14:12:02, Michal Hocko wrote:
> On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > Contiguous memory allocation can be stalled due to waiting
> > on page writeback and/or page lock which causes unpredictable
> > delay. It's a unavoidable cost for the requestor to get *big*
> > contiguous memory but it's expensive for *small* contiguous
> > memory(e.g., order-4) because caller could retry the request
> > in different range where would have easy migratable pages
> > without stalling.
> > 
> > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > alloc_contig_range so it will fail fast without blocking
> > when it encounters pages needed waiting.
> 
> I am not against controling how hard this allocator tries with gfp mask
> but this changelog is rather void on any data and any user.

OK, I can see that a user is in the last patch.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-25 13:12   ` Michal Hocko
  2021-01-25 13:13     ` Michal Hocko
@ 2021-01-25 19:33     ` Minchan Kim
  2021-01-26  7:44       ` Michal Hocko
  1 sibling, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-25 19:33 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > Contiguous memory allocation can be stalled due to waiting
> > on page writeback and/or page lock which causes unpredictable
> > delay. It's a unavoidable cost for the requestor to get *big*
> > contiguous memory but it's expensive for *small* contiguous
> > memory(e.g., order-4) because caller could retry the request
> > in different range where would have easy migratable pages
> > without stalling.
> > 
> > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > alloc_contig_range so it will fail fast without blocking
> > when it encounters pages needed waiting.
> 
> I am not against controling how hard this allocator tries with gfp mask
> but this changelog is rather void on any data and any user.
> 
> It is also rather dubious to have retries when then caller says to not
> retry.

Since max_tries is 1 with ++tries, it shouldn't retry.

> 
> Also why didn't you consider GFP_NOWAIT semantic for non blocking mode?

GFP_NOWAIT seems to be low(specific) flags rather than the one I want to
express. Even though I said only page writeback/lock in the description,
the goal is to avoid costly operations we might find later so such
"failfast", I thought GFP_NORETRY would be good fit.

> 
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/page_alloc.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index b031a5ae0bd5..1cdc3ee0b22e 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8491,12 +8491,16 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  	unsigned int nr_reclaimed;
> >  	unsigned long pfn = start;
> >  	unsigned int tries = 0;
> > +	unsigned int max_tries = 5;
> >  	int ret = 0;
> >  	struct migration_target_control mtc = {
> >  		.nid = zone_to_nid(cc->zone),
> >  		.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
> >  	};
> >  
> > +	if (cc->alloc_contig && cc->mode == MIGRATE_ASYNC)
> > +		max_tries = 1;
> > +
> >  	migrate_prep();
> >  
> >  	while (pfn < end || !list_empty(&cc->migratepages)) {
> > @@ -8513,7 +8517,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
> >  				break;
> >  			}
> >  			tries = 0;
> > -		} else if (++tries == 5) {
> > +		} else if (++tries == max_tries) {
> >  			ret = ret < 0 ? ret : -EBUSY;
> >  			break;
> >  		}
> > @@ -8564,7 +8568,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
> >  		.nr_migratepages = 0,
> >  		.order = -1,
> >  		.zone = page_zone(pfn_to_page(start)),
> > -		.mode = MIGRATE_SYNC,
> > +		.mode = gfp_mask & __GFP_NORETRY ? MIGRATE_ASYNC : MIGRATE_SYNC,
> >  		.ignore_skip_hint = true,
> >  		.no_set_skip_hint = true,
> >  		.gfp_mask = current_gfp_context(gfp_mask),
> > -- 
> > 2.30.0.296.g2bfb1c46d8-goog
> 
> -- 
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-25 13:07   ` Michal Hocko
@ 2021-01-25 19:42     ` Minchan Kim
  2021-01-26  7:38       ` Michal Hocko
  0 siblings, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-25 19:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Mon, Jan 25, 2021 at 02:07:01PM +0100, Michal Hocko wrote:
> On Thu 21-01-21 09:54:59, Minchan Kim wrote:
> > The upcoming patch will introduce __GFP_NORETRY semantic
> > in alloc_contig_range which is a failfast mode of the API.
> > Instead of adding a additional parameter for gfp, replace
> > no_warn with gfp flag.
> > 
> > To keep old behaviors, it follows the rule below.
> > 
> >   no_warn 			gfp_flags
> > 
> >   false         		GFP_KERNEL
> >   true          		GFP_KERNEL|__GFP_NOWARN
> >   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> > 
> > Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> [...]
> > diff --git a/mm/cma.c b/mm/cma.c
> > index 0ba69cd16aeb..d50627686fec 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
> >   * @cma:   Contiguous memory region for which the allocation is performed.
> >   * @count: Requested number of pages.
> >   * @align: Requested alignment of pages (in PAGE_SIZE order).
> > - * @no_warn: Avoid printing message about failed allocation
> > + * @gfp_mask: GFP mask to use during the cma allocation.
> 
> Call out supported gfp flags explicitly. Have a look at kvmalloc_node
> for a guidance.

How about this?


diff --git a/mm/cma.c b/mm/cma.c
index d50627686fec..b94727b694d6 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -423,6 +423,10 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
  *
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
+ *
+ * For gfp_mask, GFP_KERNEL and __GFP_NORETRY are supported. __GFP_NORETRY
+ * will avoid costly functions(e.g., waiting on page_writeback and locking)
+ * at current implementaion during the page migration.
  */
 struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
                       gfp_t gfp_mask)


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable
  2021-01-21 17:55 ` [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable Minchan Kim
@ 2021-01-26  7:07   ` John Stultz
  2021-01-27 20:25     ` Hridya Valsaraju
  2021-02-05 22:55   ` Rob Herring
  1 sibling, 1 reply; 31+ messages in thread
From: John Stultz @ 2021-01-26  7:07 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, Hyesoo Yu, david, Michal Hocko,
	Suren Baghdasaryan, KyongHo Cho, John Dias, Hridya Valsaraju,
	Sumit Semwal, linux-media,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Christoph Hellwig, Rob Herring,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, Jan 21, 2021 at 9:55 AM Minchan Kim <minchan@kernel.org> wrote:
>  .../reserved-memory/dma_heap_chunk.yaml       | 56 +++++++++++++++++++
>  1 file changed, 56 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
>
> diff --git a/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> new file mode 100644
> index 000000000000..00db0ae6af61
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> @@ -0,0 +1,56 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/reserved-memory/dma_heap_chunk.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Device tree binding for chunk heap on DMA HEAP FRAMEWORK
> +
> +description: |
> +  The DMA chunk heap is backed by the Contiguous Memory Allocator (CMA) and
> +  supports bulk allocation of fixed size pages.
> +
> +maintainers:
> +  - Hyesoo Yu <hyesoo.yu@samsung.com>
> +  - John Stultz <john.stultz@linaro.org>
> +  - Minchan Kim <minchan@kernel.org>
> +  - Hridya Valsaraju<hridya@google.com>
> +
> +
> +properties:
> +  compatible:
> +    enum:
> +      - dma_heap,chunk
> +
> +  chunk-order:
> +    description: |
> +            order of pages that will get allocated from the chunk DMA heap.
> +    maxItems: 1
> +
> +  size:
> +    maxItems: 1
> +
> +  alignment:
> +    maxItems: 1
> +
> +required:
> +  - compatible
> +  - size
> +  - alignment
> +  - chunk-order
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    reserved-memory {
> +        #address-cells = <2>;
> +        #size-cells = <1>;
> +
> +        chunk_memory: chunk_memory {
> +            compatible = "dma_heap,chunk";
> +            size = <0x3000000>;

Hey Minchan,
  Looking closer here, would it make more sense to document the "reg =
<>" parameter here as well instead of just "size = <>"?

That way the address of the region could be explicitly specified (for
instance, to ensure the CMA region created is 32bit addressable). And
more practically, trying to satisfy the base address alignment checks
in the final patch when its set dynamically may require a fair amount
of luck  - I couldn't manage it in my own testing on the hikey960 w/o
resorting to reg=  :)

It does look like the RESERVEDMEM_OF_DECLARE() logic already supports
this, so it's likely just a matter of documenting it here?

thanks
-john

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-21 17:55 ` [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
@ 2021-01-26  7:07   ` Christoph Hellwig
  2021-01-26 19:27     ` Minchan Kim
  2021-01-26  7:32   ` John Stultz
  2021-01-26  7:46   ` Michal Hocko
  2 siblings, 1 reply; 31+ messages in thread
From: Christoph Hellwig @ 2021-01-26  7:07 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, mhocko, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

> +config DMABUF_HEAPS_CHUNK
> +	bool "DMA-BUF CHUNK Heap"
> +	depends on DMABUF_HEAPS && DMA_CMA
> +	help
> +	  Choose this option to enable dma-buf CHUNK heap. This heap is backed
> +	  by the Contiguous Memory Allocator (CMA) and allocates the buffers that
> +	  are arranged into a list of fixed size chunks taken from CMA.

Overly long line, which in text flowing text is really, really annoying.

Many more later.  Remember that > 80 chars are only allowed if they
significantly improve readability.  

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-21 17:55 ` [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
  2021-01-26  7:07   ` Christoph Hellwig
@ 2021-01-26  7:32   ` John Stultz
  2021-01-26 19:24     ` Minchan Kim
  2021-01-26  7:46   ` Michal Hocko
  2 siblings, 1 reply; 31+ messages in thread
From: John Stultz @ 2021-01-26  7:32 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, Hyesoo Yu, david, Michal Hocko,
	Suren Baghdasaryan, KyongHo Cho, John Dias, Hridya Valsaraju,
	Sumit Semwal, linux-media,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Christoph Hellwig, Rob Herring,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Thu, Jan 21, 2021 at 9:55 AM Minchan Kim <minchan@kernel.org> wrote:

Hey Minchan,
  Thanks for sending this out! I'm still working through testing with
this patch set, so I may have some more feedback tomorrow, but a few
quick items I did hit below.

> +
> +#define CHUNK_PREFIX "chunk-"
> +
> +static int register_chunk_heap(struct chunk_heap *chunk_heap_info)
> +{
> +       struct dma_heap_export_info exp_info;
> +       const char *name = cma_get_name(chunk_heap_info->cma);
> +       size_t len = strlen(CHUNK_PREFIX) + strlen(name) + 1;
> +       char *buf = kmalloc(len, GFP_KERNEL);
> +
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       sprintf(buf, CHUNK_PREFIX"%s", cma_get_name(chunk_heap_info->cma));
> +       buf[len] = '\0';
> +
> +       exp_info.name = buf;
> +       exp_info.name = cma_get_name(chunk_heap_info->cma);

I think you intended to delete this line, as it's overwriting your
prefixed name.

> +       exp_info.ops = &chunk_heap_ops;
> +       exp_info.priv = chunk_heap_info;
> +
> +       chunk_heap_info->heap = dma_heap_add(&exp_info);
> +       if (IS_ERR(chunk_heap_info->heap)) {
> +               kfree(buf);
> +               return PTR_ERR(chunk_heap_info->heap);
> +       }
> +
> +       return 0;
> +}
> +
> +static int __init chunk_heap_init(void)
> +{
> +       unsigned int i;
> +
> +       for (i = 0; i < chunk_heap_count; i++)
> +               register_chunk_heap(&chunk_heaps[i]);
> +
> +       return 0;
> +}
> +module_init(chunk_heap_init);
> +
> +#ifdef CONFIG_OF_EARLY_FLATTREE
> +
> +static int __init dmabuf_chunk_heap_area_init(struct reserved_mem *rmem)
> +{
> +       int ret;
> +       struct cma *cma;
> +       struct chunk_heap *chunk_heap_info;
> +       const __be32 *chunk_order;
> +
> +       phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
> +       phys_addr_t mask = align - 1;
> +
> +       if ((rmem->base & mask) || (rmem->size & mask)) {
> +               pr_err("Incorrect alignment for CMA region\n");
> +               return -EINVAL;

Passing this check can be tough if you're using dynamically assigned
rmem, so it might be helpful for debugging to print the base/size/mask
values?

thanks
-john

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-25 19:42     ` Minchan Kim
@ 2021-01-26  7:38       ` Michal Hocko
  2021-01-26 19:12         ` Minchan Kim
  0 siblings, 1 reply; 31+ messages in thread
From: Michal Hocko @ 2021-01-26  7:38 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Mon 25-01-21 11:42:34, Minchan Kim wrote:
> On Mon, Jan 25, 2021 at 02:07:01PM +0100, Michal Hocko wrote:
> > On Thu 21-01-21 09:54:59, Minchan Kim wrote:
> > > The upcoming patch will introduce __GFP_NORETRY semantic
> > > in alloc_contig_range which is a failfast mode of the API.
> > > Instead of adding a additional parameter for gfp, replace
> > > no_warn with gfp flag.
> > > 
> > > To keep old behaviors, it follows the rule below.
> > > 
> > >   no_warn 			gfp_flags
> > > 
> > >   false         		GFP_KERNEL
> > >   true          		GFP_KERNEL|__GFP_NOWARN
> > >   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> > > 
> > > Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > [...]
> > > diff --git a/mm/cma.c b/mm/cma.c
> > > index 0ba69cd16aeb..d50627686fec 100644
> > > --- a/mm/cma.c
> > > +++ b/mm/cma.c
> > > @@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
> > >   * @cma:   Contiguous memory region for which the allocation is performed.
> > >   * @count: Requested number of pages.
> > >   * @align: Requested alignment of pages (in PAGE_SIZE order).
> > > - * @no_warn: Avoid printing message about failed allocation
> > > + * @gfp_mask: GFP mask to use during the cma allocation.
> > 
> > Call out supported gfp flags explicitly. Have a look at kvmalloc_node
> > for a guidance.
> 
> How about this?
> 
> 
> diff --git a/mm/cma.c b/mm/cma.c
> index d50627686fec..b94727b694d6 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -423,6 +423,10 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
>   *
>   * This function allocates part of contiguous memory on specific
>   * contiguous memory area.
> + *
> + * For gfp_mask, GFP_KERNEL and __GFP_NORETRY are supported. __GFP_NORETRY
> + * will avoid costly functions(e.g., waiting on page_writeback and locking)
> + * at current implementaion during the page migration.

rather than explicitly mentioning what the flag implies I think it would
be more useful to state the intended usecase. See how kvmalloc_node says
"__GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
preferable to the vmalloc fallback, due to visible performance
drawbacks.
__GFP_NOWARN is also supported to suppress allocation failure messages."

This would help people not familiar with internals to see whether this
flag is a good fit for them.

In this case I woul go with
"
@flags: gfp mask. Must be compatible (superset) with GFP_KERNEL.
[...]
Reclaim modifiers (__GFP_RETRY_MAYFAIL, __GFP_NOFAIL) are not supported.
__GFP_NORETRY is supported, and it should be used for opportunistic
allocation attempts that should rather fail quickly when the caller has
a fallback strategy.
"

Obviously for this patch you will go with a simple statement that
Reclaim modifiers are not supported at all.

>   */
>  struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
>                        gfp_t gfp_mask)
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-25 19:33     ` Minchan Kim
@ 2021-01-26  7:44       ` Michal Hocko
  2021-01-26 19:10         ` Minchan Kim
  2021-01-27 20:42         ` Minchan Kim
  0 siblings, 2 replies; 31+ messages in thread
From: Michal Hocko @ 2021-01-26  7:44 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > Contiguous memory allocation can be stalled due to waiting
> > > on page writeback and/or page lock which causes unpredictable
> > > delay. It's a unavoidable cost for the requestor to get *big*
> > > contiguous memory but it's expensive for *small* contiguous
> > > memory(e.g., order-4) because caller could retry the request
> > > in different range where would have easy migratable pages
> > > without stalling.
> > > 
> > > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > > alloc_contig_range so it will fail fast without blocking
> > > when it encounters pages needed waiting.
> > 
> > I am not against controling how hard this allocator tries with gfp mask
> > but this changelog is rather void on any data and any user.
> > 
> > It is also rather dubious to have retries when then caller says to not
> > retry.
> 
> Since max_tries is 1 with ++tries, it shouldn't retry.

OK, I have missed that. This is a tricky code. ASYNC mode should be
completely orthogonal to the retries count. Those are different things.
Page allocator does an explicit bail out based on __GFP_NORETRY. You
should be doing the same.

> > 
> > Also why didn't you consider GFP_NOWAIT semantic for non blocking mode?
> 
> GFP_NOWAIT seems to be low(specific) flags rather than the one I want to
> express. Even though I said only page writeback/lock in the description,
> the goal is to avoid costly operations we might find later so such
> "failfast", I thought GFP_NORETRY would be good fit.

I suspect you are too focused on implementation details here. Think
about the indended semantic. Callers of this functionality will not
think about those (I hope because if they rely on these details then the
whole thing will become unmaintainable because any change would require
an audit of all existing users). All you should be caring about is to
control how expensive the call can be. GFP_NOWAIT is not really low
level from that POV. It gives you a very lightweight non-sleeping
attempt to allocate. GFP_NORETRY will give you potentially sleeping but
an opportunistic-easy-to-fail attempt. And so on. See how that is
absolutely free of any page writeback or any specific locking.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-21 17:55 ` [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
  2021-01-26  7:07   ` Christoph Hellwig
  2021-01-26  7:32   ` John Stultz
@ 2021-01-26  7:46   ` Michal Hocko
  2021-01-26 19:25     ` Minchan Kim
  2 siblings, 1 reply; 31+ messages in thread
From: Michal Hocko @ 2021-01-26  7:46 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Thu 21-01-21 09:55:02, Minchan Kim wrote:
> From: Hyesoo Yu <hyesoo.yu@samsung.com>
> 
> This patch supports chunk heap that allocates the buffers that
> arranged into a list a fixed size chunks taken from CMA.
> 
> The chunk heap driver is bound directly to a reserved_memory
> node by following Rob Herring's suggestion in [1].
> 
> [1] https://lore.kernel.org/lkml/20191025225009.50305-2-john.stultz@linaro.org/T/#m3dc63acd33fea269a584f43bb799a876f0b2b45d

Who is using this allocator in the kernel?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-26  7:44       ` Michal Hocko
@ 2021-01-26 19:10         ` Minchan Kim
  2021-01-27  8:14           ` Michal Hocko
  2021-01-27 20:42         ` Minchan Kim
  1 sibling, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-26 19:10 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue, Jan 26, 2021 at 08:44:49AM +0100, Michal Hocko wrote:
> On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> > On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > > Contiguous memory allocation can be stalled due to waiting
> > > > on page writeback and/or page lock which causes unpredictable
> > > > delay. It's a unavoidable cost for the requestor to get *big*
> > > > contiguous memory but it's expensive for *small* contiguous
> > > > memory(e.g., order-4) because caller could retry the request
> > > > in different range where would have easy migratable pages
> > > > without stalling.
> > > > 
> > > > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > > > alloc_contig_range so it will fail fast without blocking
> > > > when it encounters pages needed waiting.
> > > 
> > > I am not against controling how hard this allocator tries with gfp mask
> > > but this changelog is rather void on any data and any user.
> > > 
> > > It is also rather dubious to have retries when then caller says to not
> > > retry.
> > 
> > Since max_tries is 1 with ++tries, it shouldn't retry.
> 
> OK, I have missed that. This is a tricky code. ASYNC mode should be
> completely orthogonal to the retries count. Those are different things.
> Page allocator does an explicit bail out based on __GFP_NORETRY. You
> should be doing the same.

A concern with __GFP_NOWAIT is regardless of flags passed to cma_alloc,
internal implementation of alloc_contig_range inside will use blockable
operation. See __alloc_contig_migrate_range.

If we go with __GFP_NOWAIT, we should propagate the gfp_mask inside of
__alloc_contig_migrate_range to make cma_alloc consistent with alloc_pages.
(IIUC, that's what you want - make gfp_mask consistent between cma_alloc
and alloc_pages) but I am worry about the direction will make complicate
situation since cma invovles migration context as well as target page
allocation context. Sometime, the single gfp flag could be trouble
to express both contexts all at once. 

> 
> > > 
> > > Also why didn't you consider GFP_NOWAIT semantic for non blocking mode?
> > 
> > GFP_NOWAIT seems to be low(specific) flags rather than the one I want to
> > express. Even though I said only page writeback/lock in the description,
> > the goal is to avoid costly operations we might find later so such
> > "failfast", I thought GFP_NORETRY would be good fit.
> 
> I suspect you are too focused on implementation details here. Think
> about the indended semantic. Callers of this functionality will not
> think about those (I hope because if they rely on these details then the
> whole thing will become unmaintainable because any change would require
> an audit of all existing users). All you should be caring about is to
> control how expensive the call can be. GFP_NOWAIT is not really low
> level from that POV. It gives you a very lightweight non-sleeping
> attempt to allocate. GFP_NORETRY will give you potentially sleeping but
> an opportunistic-easy-to-fail attempt. And so on. See how that is
> absolutely free of any page writeback or any specific locking.

With above reason I mentioned, I wanted to express __GFP_NORETRY as 
"opportunistic-easy-to-fail attempt" to support cma_alloc as "failfast"
for migration context.

> -- 
> Michal Hocko
> SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-26  7:38       ` Michal Hocko
@ 2021-01-26 19:12         ` Minchan Kim
  2021-01-27 20:21           ` Minchan Kim
  0 siblings, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-26 19:12 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue, Jan 26, 2021 at 08:38:08AM +0100, Michal Hocko wrote:
> On Mon 25-01-21 11:42:34, Minchan Kim wrote:
> > On Mon, Jan 25, 2021 at 02:07:01PM +0100, Michal Hocko wrote:
> > > On Thu 21-01-21 09:54:59, Minchan Kim wrote:
> > > > The upcoming patch will introduce __GFP_NORETRY semantic
> > > > in alloc_contig_range which is a failfast mode of the API.
> > > > Instead of adding a additional parameter for gfp, replace
> > > > no_warn with gfp flag.
> > > > 
> > > > To keep old behaviors, it follows the rule below.
> > > > 
> > > >   no_warn 			gfp_flags
> > > > 
> > > >   false         		GFP_KERNEL
> > > >   true          		GFP_KERNEL|__GFP_NOWARN
> > > >   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> > > > 
> > > > Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> > > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > [...]
> > > > diff --git a/mm/cma.c b/mm/cma.c
> > > > index 0ba69cd16aeb..d50627686fec 100644
> > > > --- a/mm/cma.c
> > > > +++ b/mm/cma.c
> > > > @@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
> > > >   * @cma:   Contiguous memory region for which the allocation is performed.
> > > >   * @count: Requested number of pages.
> > > >   * @align: Requested alignment of pages (in PAGE_SIZE order).
> > > > - * @no_warn: Avoid printing message about failed allocation
> > > > + * @gfp_mask: GFP mask to use during the cma allocation.
> > > 
> > > Call out supported gfp flags explicitly. Have a look at kvmalloc_node
> > > for a guidance.
> > 
> > How about this?
> > 
> > 
> > diff --git a/mm/cma.c b/mm/cma.c
> > index d50627686fec..b94727b694d6 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -423,6 +423,10 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
> >   *
> >   * This function allocates part of contiguous memory on specific
> >   * contiguous memory area.
> > + *
> > + * For gfp_mask, GFP_KERNEL and __GFP_NORETRY are supported. __GFP_NORETRY
> > + * will avoid costly functions(e.g., waiting on page_writeback and locking)
> > + * at current implementaion during the page migration.
> 
> rather than explicitly mentioning what the flag implies I think it would
> be more useful to state the intended usecase. See how kvmalloc_node says
> "__GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
> preferable to the vmalloc fallback, due to visible performance
> drawbacks.
> __GFP_NOWARN is also supported to suppress allocation failure messages."
> 
> This would help people not familiar with internals to see whether this
> flag is a good fit for them.
> 
> In this case I woul go with
> "
> @flags: gfp mask. Must be compatible (superset) with GFP_KERNEL.
> [...]
> Reclaim modifiers (__GFP_RETRY_MAYFAIL, __GFP_NOFAIL) are not supported.
> __GFP_NORETRY is supported, and it should be used for opportunistic
> allocation attempts that should rather fail quickly when the caller has
> a fallback strategy.
> "
> 
> Obviously for this patch you will go with a simple statement that
> Reclaim modifiers are not supported at all.

After more discussion for gfp_flags in thread of next patch, let me
changes a bit more based on it.

Thanks for the suggestion, Michal.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-26  7:32   ` John Stultz
@ 2021-01-26 19:24     ` Minchan Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-26 19:24 UTC (permalink / raw)
  To: John Stultz
  Cc: Andrew Morton, linux-mm, LKML, Hyesoo Yu, david, Michal Hocko,
	Suren Baghdasaryan, KyongHo Cho, John Dias, Hridya Valsaraju,
	Sumit Semwal, linux-media,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Christoph Hellwig, Rob Herring,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Mon, Jan 25, 2021 at 11:32:57PM -0800, John Stultz wrote:
> On Thu, Jan 21, 2021 at 9:55 AM Minchan Kim <minchan@kernel.org> wrote:
> 
> Hey Minchan,
>   Thanks for sending this out! I'm still working through testing with
> this patch set, so I may have some more feedback tomorrow, but a few
> quick items I did hit below.
> 
> > +
> > +#define CHUNK_PREFIX "chunk-"
> > +
> > +static int register_chunk_heap(struct chunk_heap *chunk_heap_info)
> > +{
> > +       struct dma_heap_export_info exp_info;
> > +       const char *name = cma_get_name(chunk_heap_info->cma);
> > +       size_t len = strlen(CHUNK_PREFIX) + strlen(name) + 1;
> > +       char *buf = kmalloc(len, GFP_KERNEL);
> > +
> > +       if (!buf)
> > +               return -ENOMEM;
> > +
> > +       sprintf(buf, CHUNK_PREFIX"%s", cma_get_name(chunk_heap_info->cma));
> > +       buf[len] = '\0';
> > +
> > +       exp_info.name = buf;
> > +       exp_info.name = cma_get_name(chunk_heap_info->cma);
> 
> I think you intended to delete this line, as it's overwriting your
> prefixed name.

Hi John,

You're right. Will fix it.

> 
> > +       exp_info.ops = &chunk_heap_ops;
> > +       exp_info.priv = chunk_heap_info;
> > +
> > +       chunk_heap_info->heap = dma_heap_add(&exp_info);
> > +       if (IS_ERR(chunk_heap_info->heap)) {
> > +               kfree(buf);
> > +               return PTR_ERR(chunk_heap_info->heap);
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int __init chunk_heap_init(void)
> > +{
> > +       unsigned int i;
> > +
> > +       for (i = 0; i < chunk_heap_count; i++)
> > +               register_chunk_heap(&chunk_heaps[i]);
> > +
> > +       return 0;
> > +}
> > +module_init(chunk_heap_init);
> > +
> > +#ifdef CONFIG_OF_EARLY_FLATTREE
> > +
> > +static int __init dmabuf_chunk_heap_area_init(struct reserved_mem *rmem)
> > +{
> > +       int ret;
> > +       struct cma *cma;
> > +       struct chunk_heap *chunk_heap_info;
> > +       const __be32 *chunk_order;
> > +
> > +       phys_addr_t align = PAGE_SIZE << max(MAX_ORDER - 1, pageblock_order);
> > +       phys_addr_t mask = align - 1;
> > +
> > +       if ((rmem->base & mask) || (rmem->size & mask)) {
> > +               pr_err("Incorrect alignment for CMA region\n");
> > +               return -EINVAL;
> 
> Passing this check can be tough if you're using dynamically assigned
> rmem, so it might be helpful for debugging to print the base/size/mask
> values?

Let me fold this into next respin.

diff --git a/drivers/dma-buf/heaps/chunk_heap.c b/drivers/dma-buf/heaps/chunk_heap.c
index 6fe8e69d108f..cc2ed5341b54 100644
--- a/drivers/dma-buf/heaps/chunk_heap.c
+++ b/drivers/dma-buf/heaps/chunk_heap.c
@@ -456,7 +456,8 @@ static int __init dmabuf_chunk_heap_area_init(struct reserved_mem *rmem)
        phys_addr_t mask = align - 1;

        if ((rmem->base & mask) || (rmem->size & mask)) {
-               pr_err("Incorrect alignment for CMA region\n");
+               pr_err("Incorrect alignment for CMA region: base %pa size %pa mask %pa\n",
+                               rmem->base, rmem->size, mask);
                return -EINVAL;
        }

Thanks for the review, John!

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-26  7:46   ` Michal Hocko
@ 2021-01-26 19:25     ` Minchan Kim
  2021-01-27  8:09       ` Michal Hocko
  0 siblings, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-26 19:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue, Jan 26, 2021 at 08:46:05AM +0100, Michal Hocko wrote:
> On Thu 21-01-21 09:55:02, Minchan Kim wrote:
> > From: Hyesoo Yu <hyesoo.yu@samsung.com>
> > 
> > This patch supports chunk heap that allocates the buffers that
> > arranged into a list a fixed size chunks taken from CMA.
> > 
> > The chunk heap driver is bound directly to a reserved_memory
> > node by following Rob Herring's suggestion in [1].
> > 
> > [1] https://lore.kernel.org/lkml/20191025225009.50305-2-john.stultz@linaro.org/T/#m3dc63acd33fea269a584f43bb799a876f0b2b45d
> 
> Who is using this allocator in the kernel?

Userspace uses the memory via mapping it via dmabuf.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-26  7:07   ` Christoph Hellwig
@ 2021-01-26 19:27     ` Minchan Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-26 19:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, mhocko, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, robh+dt, linaro-mm-sig

On Tue, Jan 26, 2021 at 07:07:44AM +0000, Christoph Hellwig wrote:
> > +config DMABUF_HEAPS_CHUNK
> > +	bool "DMA-BUF CHUNK Heap"
> > +	depends on DMABUF_HEAPS && DMA_CMA
> > +	help
> > +	  Choose this option to enable dma-buf CHUNK heap. This heap is backed
> > +	  by the Contiguous Memory Allocator (CMA) and allocates the buffers that
> > +	  are arranged into a list of fixed size chunks taken from CMA.
> 
> Overly long line, which in text flowing text is really, really annoying.
> 
> Many more later.  Remember that > 80 chars are only allowed if they
> significantly improve readability.  

Let me make it shorter than 80 chars in next revision.
Thanks for the reivew, Christoph.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps
  2021-01-26 19:25     ` Minchan Kim
@ 2021-01-27  8:09       ` Michal Hocko
  0 siblings, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2021-01-27  8:09 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue 26-01-21 11:25:36, Minchan Kim wrote:
> On Tue, Jan 26, 2021 at 08:46:05AM +0100, Michal Hocko wrote:
> > On Thu 21-01-21 09:55:02, Minchan Kim wrote:
> > > From: Hyesoo Yu <hyesoo.yu@samsung.com>
> > > 
> > > This patch supports chunk heap that allocates the buffers that
> > > arranged into a list a fixed size chunks taken from CMA.
> > > 
> > > The chunk heap driver is bound directly to a reserved_memory
> > > node by following Rob Herring's suggestion in [1].
> > > 
> > > [1] https://lore.kernel.org/lkml/20191025225009.50305-2-john.stultz@linaro.org/T/#m3dc63acd33fea269a584f43bb799a876f0b2b45d
> > 
> > Who is using this allocator in the kernel?
> 
> Userspace uses the memory via mapping it via dmabuf.

Ohh, I see. I thought this was an interface to consume in the kernel.
Thanks for the clarification!

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-26 19:10         ` Minchan Kim
@ 2021-01-27  8:14           ` Michal Hocko
  0 siblings, 0 replies; 31+ messages in thread
From: Michal Hocko @ 2021-01-27  8:14 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue 26-01-21 11:10:18, Minchan Kim wrote:
> On Tue, Jan 26, 2021 at 08:44:49AM +0100, Michal Hocko wrote:
> > On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> > > On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > > > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > > > Contiguous memory allocation can be stalled due to waiting
> > > > > on page writeback and/or page lock which causes unpredictable
> > > > > delay. It's a unavoidable cost for the requestor to get *big*
> > > > > contiguous memory but it's expensive for *small* contiguous
> > > > > memory(e.g., order-4) because caller could retry the request
> > > > > in different range where would have easy migratable pages
> > > > > without stalling.
> > > > > 
> > > > > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > > > > alloc_contig_range so it will fail fast without blocking
> > > > > when it encounters pages needed waiting.
> > > > 
> > > > I am not against controling how hard this allocator tries with gfp mask
> > > > but this changelog is rather void on any data and any user.
> > > > 
> > > > It is also rather dubious to have retries when then caller says to not
> > > > retry.
> > > 
> > > Since max_tries is 1 with ++tries, it shouldn't retry.
> > 
> > OK, I have missed that. This is a tricky code. ASYNC mode should be
> > completely orthogonal to the retries count. Those are different things.
> > Page allocator does an explicit bail out based on __GFP_NORETRY. You
> > should be doing the same.
> 
> A concern with __GFP_NOWAIT is regardless of flags passed to cma_alloc,
> internal implementation of alloc_contig_range inside will use blockable
> operation. See __alloc_contig_migrate_range.

Yes it is now. But nothing should prevent from making it non blockable.

> If we go with __GFP_NOWAIT, we should propagate the gfp_mask inside of
> __alloc_contig_migrate_range to make cma_alloc consistent with alloc_pages.

Absolutely. You should be doing that anyway. As I've said above you
shouldn't rely on side effects like ASYNC mode.

> (IIUC, that's what you want - make gfp_mask consistent between cma_alloc
> and alloc_pages) but I am worry about the direction will make complicate
> situation since cma invovles migration context as well as target page
> allocation context. Sometime, the single gfp flag could be trouble
> to express both contexts all at once. 

I am not sure I see your concern.

> > > > Also why didn't you consider GFP_NOWAIT semantic for non blocking mode?
> > > 
> > > GFP_NOWAIT seems to be low(specific) flags rather than the one I want to
> > > express. Even though I said only page writeback/lock in the description,
> > > the goal is to avoid costly operations we might find later so such
> > > "failfast", I thought GFP_NORETRY would be good fit.
> > 
> > I suspect you are too focused on implementation details here. Think
> > about the indended semantic. Callers of this functionality will not
> > think about those (I hope because if they rely on these details then the
> > whole thing will become unmaintainable because any change would require
> > an audit of all existing users). All you should be caring about is to
> > control how expensive the call can be. GFP_NOWAIT is not really low
> > level from that POV. It gives you a very lightweight non-sleeping
> > attempt to allocate. GFP_NORETRY will give you potentially sleeping but
> > an opportunistic-easy-to-fail attempt. And so on. See how that is
> > absolutely free of any page writeback or any specific locking.
> 
> With above reason I mentioned, I wanted to express __GFP_NORETRY as 
> "opportunistic-easy-to-fail attempt" to support cma_alloc as "failfast"
> for migration context.

Yes that is fine. And please note that I do not push for NOWAIT
semantic. If there is no user for that now then fine.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn
  2021-01-26 19:12         ` Minchan Kim
@ 2021-01-27 20:21           ` Minchan Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-27 20:21 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue, Jan 26, 2021 at 11:12:14AM -0800, Minchan Kim wrote:
> On Tue, Jan 26, 2021 at 08:38:08AM +0100, Michal Hocko wrote:
> > On Mon 25-01-21 11:42:34, Minchan Kim wrote:
> > > On Mon, Jan 25, 2021 at 02:07:01PM +0100, Michal Hocko wrote:
> > > > On Thu 21-01-21 09:54:59, Minchan Kim wrote:
> > > > > The upcoming patch will introduce __GFP_NORETRY semantic
> > > > > in alloc_contig_range which is a failfast mode of the API.
> > > > > Instead of adding a additional parameter for gfp, replace
> > > > > no_warn with gfp flag.
> > > > > 
> > > > > To keep old behaviors, it follows the rule below.
> > > > > 
> > > > >   no_warn 			gfp_flags
> > > > > 
> > > > >   false         		GFP_KERNEL
> > > > >   true          		GFP_KERNEL|__GFP_NOWARN
> > > > >   gfp & __GFP_NOWARN		GFP_KERNEL | (gfp & __GFP_NOWARN)
> > > > > 
> > > > > Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> > > > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > > [...]
> > > > > diff --git a/mm/cma.c b/mm/cma.c
> > > > > index 0ba69cd16aeb..d50627686fec 100644
> > > > > --- a/mm/cma.c
> > > > > +++ b/mm/cma.c
> > > > > @@ -419,13 +419,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
> > > > >   * @cma:   Contiguous memory region for which the allocation is performed.
> > > > >   * @count: Requested number of pages.
> > > > >   * @align: Requested alignment of pages (in PAGE_SIZE order).
> > > > > - * @no_warn: Avoid printing message about failed allocation
> > > > > + * @gfp_mask: GFP mask to use during the cma allocation.
> > > > 
> > > > Call out supported gfp flags explicitly. Have a look at kvmalloc_node
> > > > for a guidance.
> > > 
> > > How about this?
> > > 
> > > 
> > > diff --git a/mm/cma.c b/mm/cma.c
> > > index d50627686fec..b94727b694d6 100644
> > > --- a/mm/cma.c
> > > +++ b/mm/cma.c
> > > @@ -423,6 +423,10 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
> > >   *
> > >   * This function allocates part of contiguous memory on specific
> > >   * contiguous memory area.
> > > + *
> > > + * For gfp_mask, GFP_KERNEL and __GFP_NORETRY are supported. __GFP_NORETRY
> > > + * will avoid costly functions(e.g., waiting on page_writeback and locking)
> > > + * at current implementaion during the page migration.
> > 
> > rather than explicitly mentioning what the flag implies I think it would
> > be more useful to state the intended usecase. See how kvmalloc_node says
> > "__GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
> > preferable to the vmalloc fallback, due to visible performance
> > drawbacks.
> > __GFP_NOWARN is also supported to suppress allocation failure messages."
> > 
> > This would help people not familiar with internals to see whether this
> > flag is a good fit for them.
> > 
> > In this case I woul go with
> > "
> > @flags: gfp mask. Must be compatible (superset) with GFP_KERNEL.
> > [...]
> > Reclaim modifiers (__GFP_RETRY_MAYFAIL, __GFP_NOFAIL) are not supported.
> > __GFP_NORETRY is supported, and it should be used for opportunistic
> > allocation attempts that should rather fail quickly when the caller has
> > a fallback strategy.
> > "
> > 
> > Obviously for this patch you will go with a simple statement that
> > Reclaim modifiers are not supported at all.
> 
> After more discussion for gfp_flags in thread of next patch, let me
> changes a bit more based on it.
> 
> Thanks for the suggestion, Michal.

Based on the discussion in other thread, I want to go with __GFP_NORETRY
to indicate "opportunistic-easy-to-fail attempt" so suggestion words
Michal is valid so I will carry it on next version.

Thank you.





^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable
  2021-01-26  7:07   ` John Stultz
@ 2021-01-27 20:25     ` Hridya Valsaraju
  0 siblings, 0 replies; 31+ messages in thread
From: Hridya Valsaraju @ 2021-01-27 20:25 UTC (permalink / raw)
  To: John Stultz
  Cc: Minchan Kim, Andrew Morton, linux-mm, LKML, Hyesoo Yu, david,
	Michal Hocko, Suren Baghdasaryan, KyongHo Cho, John Dias,
	Sumit Semwal, linux-media,
	open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS,
	Christoph Hellwig, Rob Herring,
	moderated list:DMA BUFFER SHARING FRAMEWORK

On Mon, Jan 25, 2021 at 11:07 PM John Stultz <john.stultz@linaro.org> wrote:
>
> On Thu, Jan 21, 2021 at 9:55 AM Minchan Kim <minchan@kernel.org> wrote:
> >  .../reserved-memory/dma_heap_chunk.yaml       | 56 +++++++++++++++++++
> >  1 file changed, 56 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> >
> > diff --git a/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> > new file mode 100644
> > index 000000000000..00db0ae6af61
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> > @@ -0,0 +1,56 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/reserved-memory/dma_heap_chunk.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Device tree binding for chunk heap on DMA HEAP FRAMEWORK
> > +
> > +description: |
> > +  The DMA chunk heap is backed by the Contiguous Memory Allocator (CMA) and
> > +  supports bulk allocation of fixed size pages.
> > +
> > +maintainers:
> > +  - Hyesoo Yu <hyesoo.yu@samsung.com>
> > +  - John Stultz <john.stultz@linaro.org>
> > +  - Minchan Kim <minchan@kernel.org>
> > +  - Hridya Valsaraju<hridya@google.com>
> > +
> > +
> > +properties:
> > +  compatible:
> > +    enum:
> > +      - dma_heap,chunk
> > +
> > +  chunk-order:
> > +    description: |
> > +            order of pages that will get allocated from the chunk DMA heap.
> > +    maxItems: 1
> > +
> > +  size:
> > +    maxItems: 1
> > +
> > +  alignment:
> > +    maxItems: 1
> > +
> > +required:
> > +  - compatible
> > +  - size
> > +  - alignment
> > +  - chunk-order
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > +  - |
> > +    reserved-memory {
> > +        #address-cells = <2>;
> > +        #size-cells = <1>;
> > +
> > +        chunk_memory: chunk_memory {
> > +            compatible = "dma_heap,chunk";
> > +            size = <0x3000000>;
>
> Hey Minchan,
>   Looking closer here, would it make more sense to document the "reg =
> <>" parameter here as well instead of just "size = <>"?
>
> That way the address of the region could be explicitly specified (for
> instance, to ensure the CMA region created is 32bit addressable). And
> more practically, trying to satisfy the base address alignment checks
> in the final patch when its set dynamically may require a fair amount
> of luck  - I couldn't manage it in my own testing on the hikey960 w/o
> resorting to reg=  :)
>
> It does look like the RESERVEDMEM_OF_DECLARE() logic already supports
> this, so it's likely just a matter of documenting it here?

Thank you John, yes, that makes sense. We will add the 'reg' parameter
as well when we send out the next version.

Regards,
Hridya

>
> thanks
> -john

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-26  7:44       ` Michal Hocko
  2021-01-26 19:10         ` Minchan Kim
@ 2021-01-27 20:42         ` Minchan Kim
  2021-01-28  7:53           ` Michal Hocko
  1 sibling, 1 reply; 31+ messages in thread
From: Minchan Kim @ 2021-01-27 20:42 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Tue, Jan 26, 2021 at 08:44:49AM +0100, Michal Hocko wrote:
> On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> > On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > > Contiguous memory allocation can be stalled due to waiting
> > > > on page writeback and/or page lock which causes unpredictable
> > > > delay. It's a unavoidable cost for the requestor to get *big*
> > > > contiguous memory but it's expensive for *small* contiguous
> > > > memory(e.g., order-4) because caller could retry the request
> > > > in different range where would have easy migratable pages
> > > > without stalling.
> > > > 
> > > > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > > > alloc_contig_range so it will fail fast without blocking
> > > > when it encounters pages needed waiting.
> > > 
> > > I am not against controling how hard this allocator tries with gfp mask
> > > but this changelog is rather void on any data and any user.
> > > 
> > > It is also rather dubious to have retries when then caller says to not
> > > retry.
> > 
> > Since max_tries is 1 with ++tries, it shouldn't retry.
> 
> OK, I have missed that. This is a tricky code. ASYNC mode should be
> completely orthogonal to the retries count. Those are different things.
> Page allocator does an explicit bail out based on __GFP_NORETRY. You
> should be doing the same.

Before sending next revision, let me check this part again.

I want to use __GFP_NORETRY to indicate "opportunistic-easy-to-fail attempt"
and I want to use ASYNC migrate_mode to help the goal.

Do you see the problem?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-27 20:42         ` Minchan Kim
@ 2021-01-28  7:53           ` Michal Hocko
  2021-01-28 16:56             ` Minchan Kim
  0 siblings, 1 reply; 31+ messages in thread
From: Michal Hocko @ 2021-01-28  7:53 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Wed 27-01-21 12:42:45, Minchan Kim wrote:
> On Tue, Jan 26, 2021 at 08:44:49AM +0100, Michal Hocko wrote:
> > On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> > > On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > > > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > > > Contiguous memory allocation can be stalled due to waiting
> > > > > on page writeback and/or page lock which causes unpredictable
> > > > > delay. It's a unavoidable cost for the requestor to get *big*
> > > > > contiguous memory but it's expensive for *small* contiguous
> > > > > memory(e.g., order-4) because caller could retry the request
> > > > > in different range where would have easy migratable pages
> > > > > without stalling.
> > > > > 
> > > > > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > > > > alloc_contig_range so it will fail fast without blocking
> > > > > when it encounters pages needed waiting.
> > > > 
> > > > I am not against controling how hard this allocator tries with gfp mask
> > > > but this changelog is rather void on any data and any user.
> > > > 
> > > > It is also rather dubious to have retries when then caller says to not
> > > > retry.
> > > 
> > > Since max_tries is 1 with ++tries, it shouldn't retry.
> > 
> > OK, I have missed that. This is a tricky code. ASYNC mode should be
> > completely orthogonal to the retries count. Those are different things.
> > Page allocator does an explicit bail out based on __GFP_NORETRY. You
> > should be doing the same.
> 
> Before sending next revision, let me check this part again.
> 
> I want to use __GFP_NORETRY to indicate "opportunistic-easy-to-fail attempt"
> and I want to use ASYNC migrate_mode to help the goal.
> 
> Do you see the problem?

No, as I've said. This is a normal NORETRY policy. And ASYNC migration
is a mere implementation detail you do not have bother your users about.
This is the semantic view. From the implementation POV it should be the
gfp mask to drive decisions rather than a random (ASYNC) flag to control
retries as you did here.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range
  2021-01-28  7:53           ` Michal Hocko
@ 2021-01-28 16:56             ` Minchan Kim
  0 siblings, 0 replies; 31+ messages in thread
From: Minchan Kim @ 2021-01-28 16:56 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, robh+dt, linaro-mm-sig

On Thu, Jan 28, 2021 at 08:53:25AM +0100, Michal Hocko wrote:
> On Wed 27-01-21 12:42:45, Minchan Kim wrote:
> > On Tue, Jan 26, 2021 at 08:44:49AM +0100, Michal Hocko wrote:
> > > On Mon 25-01-21 11:33:36, Minchan Kim wrote:
> > > > On Mon, Jan 25, 2021 at 02:12:00PM +0100, Michal Hocko wrote:
> > > > > On Thu 21-01-21 09:55:00, Minchan Kim wrote:
> > > > > > Contiguous memory allocation can be stalled due to waiting
> > > > > > on page writeback and/or page lock which causes unpredictable
> > > > > > delay. It's a unavoidable cost for the requestor to get *big*
> > > > > > contiguous memory but it's expensive for *small* contiguous
> > > > > > memory(e.g., order-4) because caller could retry the request
> > > > > > in different range where would have easy migratable pages
> > > > > > without stalling.
> > > > > > 
> > > > > > This patch introduce __GFP_NORETRY as compaction gfp_mask in
> > > > > > alloc_contig_range so it will fail fast without blocking
> > > > > > when it encounters pages needed waiting.
> > > > > 
> > > > > I am not against controling how hard this allocator tries with gfp mask
> > > > > but this changelog is rather void on any data and any user.
> > > > > 
> > > > > It is also rather dubious to have retries when then caller says to not
> > > > > retry.
> > > > 
> > > > Since max_tries is 1 with ++tries, it shouldn't retry.
> > > 
> > > OK, I have missed that. This is a tricky code. ASYNC mode should be
> > > completely orthogonal to the retries count. Those are different things.
> > > Page allocator does an explicit bail out based on __GFP_NORETRY. You
> > > should be doing the same.
> > 
> > Before sending next revision, let me check this part again.
> > 
> > I want to use __GFP_NORETRY to indicate "opportunistic-easy-to-fail attempt"
> > and I want to use ASYNC migrate_mode to help the goal.
> > 
> > Do you see the problem?
> 
> No, as I've said. This is a normal NORETRY policy. And ASYNC migration
> is a mere implementation detail you do not have bother your users about.
> This is the semantic view. From the implementation POV it should be the
> gfp mask to drive decisions rather than a random (ASYNC) flag to control
> retries as you did here.

Make sense.

Let me cook next revision.

Thanks for the review, Michal.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable
  2021-01-21 17:55 ` [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable Minchan Kim
  2021-01-26  7:07   ` John Stultz
@ 2021-02-05 22:55   ` Rob Herring
  1 sibling, 0 replies; 31+ messages in thread
From: Rob Herring @ 2021-02-05 22:55 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, LKML, hyesoo.yu, david, mhocko, surenb,
	pullip.cho, joaodias, hridya, john.stultz, sumit.semwal,
	linux-media, devicetree, hch, linaro-mm-sig

On Thu, Jan 21, 2021 at 09:55:01AM -0800, Minchan Kim wrote:
> From: Hyesoo Yu <hyesoo.yu@samsung.com>
> 
> Document devicetree binding for chunk cma heap on dma heap framework.
> 
> The DMA chunk heap supports the bulk allocation of higher order pages.
> 
> The chunk heap's allocator allocates from the CMA area. It is optimized
> to perform bulk allocation of higher order pages in an efficient manner.
> For this purpose, the heap needs an exclusive CMA area that will only be
> used for allocation by the heap. This is the reason why we need to use
> the DT to create and configure a reserved memory region for use by the
> chunk CMA heap driver. Since all allocation from DMA-BUF heaps happen
> from the user-space, there is no other appropriate device-driver that
> we can use to register the chunk CMA heap and configure the reserved
> memory region for its use.

LWN tells me we don't need carve outs any more[1]: "CMA now relies on 
compaction and no longer uses a carved-out memory region."

So why do we need this?

[1] https://lwn.net/Articles/839216/

> 
> Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> Signed-off-by: Hridya Valsaraju <hridya@google.com>
> ---
>  .../reserved-memory/dma_heap_chunk.yaml       | 56 +++++++++++++++++++
>  1 file changed, 56 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> 
> diff --git a/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> new file mode 100644
> index 000000000000..00db0ae6af61
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/reserved-memory/dma_heap_chunk.yaml
> @@ -0,0 +1,56 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/reserved-memory/dma_heap_chunk.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Device tree binding for chunk heap on DMA HEAP FRAMEWORK
> +
> +description: |
> +  The DMA chunk heap is backed by the Contiguous Memory Allocator (CMA) and
> +  supports bulk allocation of fixed size pages.
> +
> +maintainers:
> +  - Hyesoo Yu <hyesoo.yu@samsung.com>
> +  - John Stultz <john.stultz@linaro.org>
> +  - Minchan Kim <minchan@kernel.org>
> +  - Hridya Valsaraju<hridya@google.com>
> +
> +
> +properties:
> +  compatible:
> +    enum:
> +      - dma_heap,chunk

Convention is vendor,thing and 'dma_heap' is not a vendor. Also, '-' is 
preferred over '_'.

> +
> +  chunk-order:
> +    description: |
> +            order of pages that will get allocated from the chunk DMA heap.

Page size depends on Linux configuration. And 'order' is very much a 
Linuxism.

> +    maxItems: 1
> +
> +  size:
> +    maxItems: 1
> +
> +  alignment:
> +    maxItems: 1
> +
> +required:
> +  - compatible
> +  - size
> +  - alignment
> +  - chunk-order
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    reserved-memory {
> +        #address-cells = <2>;
> +        #size-cells = <1>;
> +
> +        chunk_memory: chunk_memory {
> +            compatible = "dma_heap,chunk";
> +            size = <0x3000000>;
> +            alignment = <0x0 0x00010000>;
> +            chunk-order = <4>;
> +        };
> +    };
> -- 
> 2.30.0.296.g2bfb1c46d8-goog
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, back to index

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-21 17:54 [PATCH v4 0/4] Chunk Heap Support on DMA-HEAP Minchan Kim
2021-01-21 17:54 ` [PATCH v4 1/4] mm: cma: introduce gfp flag in cma_alloc instead of no_warn Minchan Kim
2021-01-21 18:46   ` David Hildenbrand
2021-01-21 18:50   ` Minchan Kim
2021-01-25 13:07   ` Michal Hocko
2021-01-25 19:42     ` Minchan Kim
2021-01-26  7:38       ` Michal Hocko
2021-01-26 19:12         ` Minchan Kim
2021-01-27 20:21           ` Minchan Kim
2021-01-21 17:55 ` [PATCH v4 2/4] mm: failfast mode with __GFP_NORETRY in alloc_contig_range Minchan Kim
2021-01-25 13:12   ` Michal Hocko
2021-01-25 13:13     ` Michal Hocko
2021-01-25 19:33     ` Minchan Kim
2021-01-26  7:44       ` Michal Hocko
2021-01-26 19:10         ` Minchan Kim
2021-01-27  8:14           ` Michal Hocko
2021-01-27 20:42         ` Minchan Kim
2021-01-28  7:53           ` Michal Hocko
2021-01-28 16:56             ` Minchan Kim
2021-01-21 17:55 ` [PATCH v4 3/4] dt-bindings: reserved-memory: Make DMA-BUF CMA heap DT-configurable Minchan Kim
2021-01-26  7:07   ` John Stultz
2021-01-27 20:25     ` Hridya Valsaraju
2021-02-05 22:55   ` Rob Herring
2021-01-21 17:55 ` [PATCH v4 4/4] dma-buf: heaps: add chunk heap to dmabuf heaps Minchan Kim
2021-01-26  7:07   ` Christoph Hellwig
2021-01-26 19:27     ` Minchan Kim
2021-01-26  7:32   ` John Stultz
2021-01-26 19:24     ` Minchan Kim
2021-01-26  7:46   ` Michal Hocko
2021-01-26 19:25     ` Minchan Kim
2021-01-27  8:09       ` Michal Hocko

Linux-Devicetree Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-devicetree/0 linux-devicetree/git/0.git
	git clone --mirror https://lore.kernel.org/linux-devicetree/1 linux-devicetree/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-devicetree linux-devicetree/ https://lore.kernel.org/linux-devicetree \
		devicetree@vger.kernel.org
	public-inbox-index linux-devicetree

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-devicetree


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git