linux-mm.kvack.org archive mirror
* [PATCH 0/6] Improve handling of GFP flags in the CMA allocator
@ 2019-02-18 21:07 Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 1/6] Revert "kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()" Gabriel Krisman Bertazi
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

Hi,

The main goal of this patchset is to solve a deadlock in the CMA
allocator, which happens because cma_alloc tries to sleep waiting for an
IO in the GFP_NOIO path.  This issue, which was reported by Gael Portay
was discussed here:

https://groups.google.com/a/lists.one-eyed-alien.net/forum/#!topic/usb-storage/BXpAsg-G1us

My proposal requires reverting the patches that removed the gfp flags
parameter from cma_alloc() (patches 1 to 3).  According to the author,
that parameter was removed because it misled developers about what
cma_alloc actually supports; in his specific case he had problems with
GFP_ZERO.  With that in mind, patch 4 implements GFP_ZERO in a fairly
trivial way.  Finally, patches 5 and 6 attempt to fix the deadlock
itself by avoiding the unnecessary serialization done around
alloc_contig_range.

This is my first adventure in the mm subsystem, so I hope I didn't screw
up something very obvious.  I tested this on the workload that was
deadlocking (an ARM board, with CMA-intensive operations from the GPU
and USB), as well as with some scripting on top of debugfs.  Is there
any regression test I should be running that specifically applies to the
CMA code?


Gabriel Krisman Bertazi (6):
  Revert "kernel/dma: remove unsupported gfp_mask parameter from
    dma_alloc_from_contiguous()"
  Revert "mm/cma: remove unsupported gfp_mask parameter from
    cma_alloc()"
  cma: Warn about callers requesting unsupported flags
  cma: Add support for GFP_ZERO
  page_isolation: Propagate temporary pageblock isolation error
  cma: Isolate pageblocks speculatively during allocation

 arch/arm/mm/dma-mapping.c                  |  5 +--
 arch/arm64/mm/dma-mapping.c                |  2 +-
 arch/powerpc/kvm/book3s_hv_builtin.c       |  2 +-
 arch/xtensa/kernel/pci-dma.c               |  2 +-
 drivers/iommu/amd_iommu.c                  |  2 +-
 drivers/iommu/intel-iommu.c                |  3 +-
 drivers/s390/char/vmcp.c                   |  2 +-
 drivers/staging/android/ion/ion_cma_heap.c |  2 +-
 include/linux/cma.h                        |  2 +-
 include/linux/dma-contiguous.h             |  4 +-
 kernel/dma/contiguous.c                    |  6 +--
 kernel/dma/direct.c                        |  3 +-
 kernel/dma/remap.c                         |  2 +-
 mm/cma.c                                   | 51 ++++++++++++++++++----
 mm/cma_debug.c                             |  2 +-
 mm/page_isolation.c                        | 20 ++++++---
 16 files changed, 74 insertions(+), 36 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/6] Revert "kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()"
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
@ 2019-02-18 21:07 ` Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 2/6] Revert "mm/cma: remove unsupported gfp_mask parameter from cma_alloc()" Gabriel Krisman Bertazi
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

This reverts commit d834c5ab83febf9624ad3b16c3c348aa1e02014c.

Commit d834c5ab83fe ("kernel/dma: remove unsupported gfp_mask parameter
from dma_alloc_from_contiguous()") attempts to make it clearer that the
CMA allocator doesn't support all of the standard GFP flags by removing
that parameter from dma_alloc_from_contiguous().  Unfortunately, this
doesn't make it much clearer what CMA supports, as exemplified by the
ARM DMA layer issue, which simply masks away the GFP_NOIO flag when
calling the CMA allocator, letting it assume the hardcoded GFP_KERNEL.

The removal of gfp_flags doesn't make it any easier to eventually
support these flags in the CMA allocator, as the rest of this series
attempts to do.  Instead, this patch and the next one restore that
parameter.

CC: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
  [Fix new callers]
---
 arch/arm/mm/dma-mapping.c      | 5 ++---
 arch/arm64/mm/dma-mapping.c    | 2 +-
 arch/xtensa/kernel/pci-dma.c   | 2 +-
 drivers/iommu/amd_iommu.c      | 2 +-
 drivers/iommu/intel-iommu.c    | 3 +--
 include/linux/dma-contiguous.h | 4 ++--
 kernel/dma/contiguous.c        | 7 ++++---
 kernel/dma/direct.c            | 3 +--
 kernel/dma/remap.c             | 2 +-
 9 files changed, 14 insertions(+), 16 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index f1e2922e447c..bc3a62087f52 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -586,7 +586,7 @@ static void *__alloc_from_contiguous(struct device *dev, size_t size,
 	struct page *page;
 	void *ptr = NULL;
 
-	page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
+	page = dma_alloc_from_contiguous(dev, count, order, gfp);
 	if (!page)
 		return NULL;
 
@@ -1291,8 +1291,7 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
 		unsigned long order = get_order(size);
 		struct page *page;
 
-		page = dma_alloc_from_contiguous(dev, count, order,
-						 gfp & __GFP_NOWARN);
+		page = dma_alloc_from_contiguous(dev, count, order, gfp);
 		if (!page)
 			goto error;
 
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 78c0a72f822c..660adedaab5d 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -159,7 +159,7 @@ static void *__iommu_alloc_attrs(struct device *dev, size_t size,
 		struct page *page;
 
 		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					get_order(size), gfp & __GFP_NOWARN);
+						 get_order(size), gfp);
 		if (!page)
 			return NULL;
 
diff --git a/arch/xtensa/kernel/pci-dma.c b/arch/xtensa/kernel/pci-dma.c
index 9171bff76fc4..e15b893caadb 100644
--- a/arch/xtensa/kernel/pci-dma.c
+++ b/arch/xtensa/kernel/pci-dma.c
@@ -157,7 +157,7 @@ void *arch_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
 
 	if (gfpflags_allow_blocking(flag))
 		page = dma_alloc_from_contiguous(dev, count, get_order(size),
-						 flag & __GFP_NOWARN);
+						 flag);
 
 	if (!page)
 		page = alloc_pages(flag | __GFP_ZERO, get_order(size));
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 2a7b78bb98b4..23346a7a32fc 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2692,7 +2692,7 @@ static void *alloc_coherent(struct device *dev, size_t size,
 			return NULL;
 
 		page = dma_alloc_from_contiguous(dev, size >> PAGE_SHIFT,
-					get_order(size), flag & __GFP_NOWARN);
+						 get_order(size), flag);
 		if (!page)
 			return NULL;
 	}
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..ebaab2d3750f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3791,8 +3791,7 @@ static void *intel_alloc_coherent(struct device *dev, size_t size,
 	if (gfpflags_allow_blocking(flags)) {
 		unsigned int count = size >> PAGE_SHIFT;
 
-		page = dma_alloc_from_contiguous(dev, count, order,
-						 flags & __GFP_NOWARN);
+		page = dma_alloc_from_contiguous(dev, count, order, flags);
 		if (page && iommu_no_mapping(dev) &&
 		    page_to_phys(page) + size > dev->coherent_dma_mask) {
 			dma_release_from_contiguous(dev, page, count);
diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index f247e8aa5e3d..3c5a4cb3eb95 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -112,7 +112,7 @@ static inline int dma_declare_contiguous(struct device *dev, phys_addr_t size,
 }
 
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
-				       unsigned int order, bool no_warn);
+				       unsigned int order, gfp_t gfp_mask);
 bool dma_release_from_contiguous(struct device *dev, struct page *pages,
 				 int count);
 
@@ -145,7 +145,7 @@ int dma_declare_contiguous(struct device *dev, phys_addr_t size,
 
 static inline
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
-				       unsigned int order, bool no_warn)
+				       unsigned int order, gfp_t gfp_mask)
 {
 	return NULL;
 }
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..b1c3109bbd26 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -182,7 +182,7 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
  * @dev:   Pointer to device for which the allocation is performed.
  * @count: Requested number of pages.
  * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @no_warn: Avoid printing message about failed allocation.
+ * @gfp_mask: GFP flags to use for this allocation.
  *
  * This function allocates memory buffer for specified device. It uses
  * device specific contiguous memory area if available or the default
@@ -190,12 +190,13 @@ int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
  * function.
  */
 struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
-				       unsigned int align, bool no_warn)
+				       unsigned int align, gfp_t gfp_mask)
 {
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
+	return cma_alloc(dev_get_cma_area(dev), count, align,
+			 gfp_mask & __GFP_NOWARN);
 }
 
 /**
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 355d16acee6d..6c7009dc9cab 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -111,8 +111,7 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
 again:
 	/* CMA can be used only in the context which permits sleeping */
 	if (gfpflags_allow_blocking(gfp)) {
-		page = dma_alloc_from_contiguous(dev, count, page_order,
-						 gfp & __GFP_NOWARN);
+		page = dma_alloc_from_contiguous(dev, count, page_order, gfp);
 		if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
 			dma_release_from_contiguous(dev, page, count);
 			page = NULL;
diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index 7a723194ecbe..862fc8e781c2 100644
--- a/kernel/dma/remap.c
+++ b/kernel/dma/remap.c
@@ -115,7 +115,7 @@ int __init dma_atomic_pool_init(gfp_t gfp, pgprot_t prot)
 
 	if (dev_get_cma_area(NULL))
 		page = dma_alloc_from_contiguous(NULL, nr_pages,
-						 pool_size_order, false);
+						 pool_size_order, gfp);
 	else
 		page = alloc_pages(gfp, pool_size_order);
 	if (!page)
-- 
2.20.1



* [PATCH 2/6] Revert "mm/cma: remove unsupported gfp_mask parameter from cma_alloc()"
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 1/6] Revert "kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()" Gabriel Krisman Bertazi
@ 2019-02-18 21:07 ` Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 3/6] cma: Warn about callers requesting unsupported flags Gabriel Krisman Bertazi
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

This reverts commit 6518202970c1052148daaef9a8096711775e43a2.

Commit 6518202970c1 ("mm/cma: remove unsupported gfp_mask parameter from
cma_alloc()") attempts to make it clearer that the CMA allocator doesn't
support all of the standard GFP flags by removing that parameter from
cma_alloc().  Unfortunately, this doesn't make it much clearer what CMA
supports, as exemplified by the ARM DMA layer issue, which simply masks
away the GFP_NOIO flag when calling the CMA allocator, letting it assume
the hardcoded GFP_KERNEL.

The removal of gfp_flags doesn't make it any easier to eventually
support these flags in the CMA allocator, as the rest of this series
attempts to do.  Instead, this patch and the previous one restore that
parameter.

CC: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 arch/powerpc/kvm/book3s_hv_builtin.c       | 2 +-
 drivers/s390/char/vmcp.c                   | 2 +-
 drivers/staging/android/ion/ion_cma_heap.c | 2 +-
 include/linux/cma.h                        | 2 +-
 kernel/dma/contiguous.c                    | 3 +--
 mm/cma.c                                   | 8 ++++----
 mm/cma_debug.c                             | 2 +-
 7 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_builtin.c b/arch/powerpc/kvm/book3s_hv_builtin.c
index a71e2fc00a4e..925e8f9fc10d 100644
--- a/arch/powerpc/kvm/book3s_hv_builtin.c
+++ b/arch/powerpc/kvm/book3s_hv_builtin.c
@@ -77,7 +77,7 @@ struct page *kvm_alloc_hpt_cma(unsigned long nr_pages)
 	VM_BUG_ON(order_base_2(nr_pages) < KVM_CMA_CHUNK_ORDER - PAGE_SHIFT);
 
 	return cma_alloc(kvm_cma, nr_pages, order_base_2(HPT_ALIGN_PAGES),
-			 false);
+			 GFP_KERNEL);
 }
 EXPORT_SYMBOL_GPL(kvm_alloc_hpt_cma);
 
diff --git a/drivers/s390/char/vmcp.c b/drivers/s390/char/vmcp.c
index 0fa1b6b1491a..948ce82a7725 100644
--- a/drivers/s390/char/vmcp.c
+++ b/drivers/s390/char/vmcp.c
@@ -68,7 +68,7 @@ static void vmcp_response_alloc(struct vmcp_session *session)
 	 * anymore the system won't work anyway.
 	 */
 	if (order > 2)
-		page = cma_alloc(vmcp_cma, nr_pages, 0, false);
+		page = cma_alloc(vmcp_cma, nr_pages, 0, GFP_KERNEL);
 	if (page) {
 		session->response = (char *)page_to_phys(page);
 		session->cma_alloc = 1;
diff --git a/drivers/staging/android/ion/ion_cma_heap.c b/drivers/staging/android/ion/ion_cma_heap.c
index 3fafd013d80a..49718c96bf9e 100644
--- a/drivers/staging/android/ion/ion_cma_heap.c
+++ b/drivers/staging/android/ion/ion_cma_heap.c
@@ -39,7 +39,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	pages = cma_alloc(cma_heap->cma, nr_pages, align, false);
+	pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL);
 	if (!pages)
 		return -ENOMEM;
 
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 190184b5ff32..bf90f0bb42bd 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -33,7 +33,7 @@ extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					const char *name,
 					struct cma **res_cma);
 extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-			      bool no_warn);
+			      gfp_t gfp_mask);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b1c3109bbd26..54a33337680f 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -195,8 +195,7 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
 	if (align > CONFIG_CMA_ALIGNMENT)
 		align = CONFIG_CMA_ALIGNMENT;
 
-	return cma_alloc(dev_get_cma_area(dev), count, align,
-			 gfp_mask & __GFP_NOWARN);
+	return cma_alloc(dev_get_cma_area(dev), count, align, gfp_mask);
 }
 
 /**
diff --git a/mm/cma.c b/mm/cma.c
index c7b39dd3b4f6..fdad7ad0d9c4 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -395,13 +395,13 @@ static inline void cma_debug_show_areas(struct cma *cma) { }
  * @cma:   Contiguous memory region for which the allocation is performed.
  * @count: Requested number of pages.
  * @align: Requested alignment of pages (in PAGE_SIZE order).
- * @no_warn: Avoid printing message about failed allocation
+ * @gfp_mask:  GFP mask to use during compaction
  *
  * This function allocates part of contiguous memory on specific
  * contiguous memory area.
  */
 struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
-		       bool no_warn)
+		       gfp_t gfp_mask)
 {
 	unsigned long mask, offset;
 	unsigned long pfn = -1;
@@ -448,7 +448,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma_mutex);
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
-				     GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
+					 gfp_mask);
 		mutex_unlock(&cma_mutex);
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
@@ -477,7 +477,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 			page_kasan_tag_reset(page + i);
 	}
 
-	if (ret && !no_warn) {
+	if (ret && !(gfp_mask & __GFP_NOWARN)) {
 		pr_err("%s: alloc failed, req-size: %zu pages, ret: %d\n",
 			__func__, count, ret);
 		cma_debug_show_areas(cma);
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index ad6723e9d110..f23467291cfb 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -139,7 +139,7 @@ static int cma_alloc_mem(struct cma *cma, int count)
 	if (!mem)
 		return -ENOMEM;
 
-	p = cma_alloc(cma, count, 0, false);
+	p = cma_alloc(cma, count, 0, GFP_KERNEL);
 	if (!p) {
 		kfree(mem);
 		return -ENOMEM;
-- 
2.20.1



* [PATCH 3/6] cma: Warn about callers requesting unsupported flags
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 1/6] Revert "kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()" Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 2/6] Revert "mm/cma: remove unsupported gfp_mask parameter from cma_alloc()" Gabriel Krisman Bertazi
@ 2019-02-18 21:07 ` Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 4/6] cma: Add support for GFP_ZERO Gabriel Krisman Bertazi
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

The CMA allocator does not support every standard GFP flag.  Instead of
silently ignoring unsupported flags, be noisy about them, to catch wrong
code paths early.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 mm/cma.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index fdad7ad0d9c4..5789e3545faf 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -411,6 +411,10 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	struct page *page = NULL;
 	int ret = -ENOMEM;
 
+	/* Be noisy about caller asking for unsupported flags. */
+	WARN_ON(unlikely(!(gfp_mask & __GFP_DIRECT_RECLAIM) ||
+			 (gfp_mask & (__GFP_ZERO|__GFP_NOFAIL))));
+
 	if (!cma || !cma->count)
 		return NULL;
 
-- 
2.20.1



* [PATCH 4/6] cma: Add support for GFP_ZERO
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
                   ` (2 preceding siblings ...)
  2019-02-18 21:07 ` [PATCH 3/6] cma: Warn about callers requesting unsupported flags Gabriel Krisman Bertazi
@ 2019-02-18 21:07 ` Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 5/6] page_isolation: Propagate temporary pageblock isolation error Gabriel Krisman Bertazi
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

Since cma_alloc() once again takes a gfp_mask, make it honor __GFP_ZERO,
so as not to surprise potential users.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 mm/cma.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 5789e3545faf..1dff74b1a8c5 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -408,12 +408,13 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	unsigned long start = 0;
 	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
 	size_t i;
+	void *kaddr;
 	struct page *page = NULL;
 	int ret = -ENOMEM;
 
 	/* Be noisy about caller asking for unsupported flags. */
 	WARN_ON(unlikely(!(gfp_mask & __GFP_DIRECT_RECLAIM) ||
-			 (gfp_mask & (__GFP_ZERO|__GFP_NOFAIL))));
+			 (gfp_mask & __GFP_NOFAIL)));
 
 	if (!cma || !cma->count)
 		return NULL;
@@ -477,8 +478,15 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	 * those page blocks.
 	 */
 	if (page) {
-		for (i = 0; i < count; i++)
+		for (i = 0; i < count; i++) {
 			page_kasan_tag_reset(page + i);
+
+			if (gfp_mask & __GFP_ZERO) {
+				kaddr = kmap_atomic(page + i);
+				clear_page(kaddr);
+				kunmap_atomic(kaddr);
+			}
+		}
 	}
 
 	if (ret && !(gfp_mask & __GFP_NOWARN)) {
-- 
2.20.1



* [PATCH 5/6] page_isolation: Propagate temporary pageblock isolation error
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
                   ` (3 preceding siblings ...)
  2019-02-18 21:07 ` [PATCH 4/6] cma: Add support for GFP_ZERO Gabriel Krisman Bertazi
@ 2019-02-18 21:07 ` Gabriel Krisman Bertazi
  2019-02-18 21:07 ` [PATCH 6/6] cma: Isolate pageblocks speculatively during allocation Gabriel Krisman Bertazi
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

If the page isolation failed because we raced with another thread
setting up the pageblock's migrate type, it is very likely that a
further attempt will succeed.  In this case, return -EAGAIN instead of
-EBUSY, to let callers handle this condition properly.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 mm/page_isolation.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index ce323e56b34d..a8169d8ea02d 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -30,10 +30,12 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	/*
 	 * We assume the caller intended to SET migrate type to isolate.
 	 * If it is already set, then someone else must have raced and
-	 * set it before us.  Return -EBUSY
+	 * set it before us.  Return -EAGAIN
 	 */
-	if (is_migrate_isolate_page(page))
+	if (is_migrate_isolate_page(page)) {
+		ret = -EAGAIN;
 		goto out;
+	}
 
 	pfn = page_to_pfn(page);
 	arg.start_pfn = pfn;
@@ -188,6 +190,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	unsigned long pfn;
 	unsigned long undo_pfn;
 	struct page *page;
+	int ret = -EBUSY;
 
 	BUG_ON(!IS_ALIGNED(start_pfn, pageblock_nr_pages));
 	BUG_ON(!IS_ALIGNED(end_pfn, pageblock_nr_pages));
@@ -196,10 +199,13 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	     pfn < end_pfn;
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
-		if (page &&
-		    set_migratetype_isolate(page, migratetype, flags)) {
-			undo_pfn = pfn;
-			goto undo;
+		if (page) {
+			ret = set_migratetype_isolate(page, migratetype,
+						      flags);
+			if (ret) {
+				undo_pfn = pfn;
+				goto undo;
+			}
 		}
 	}
 	return 0;
@@ -213,7 +219,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 		unset_migratetype_isolate(page, migratetype);
 	}
 
-	return -EBUSY;
+	return ret;
 }
 
 /*
-- 
2.20.1



* [PATCH 6/6] cma: Isolate pageblocks speculatively during allocation
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
                   ` (4 preceding siblings ...)
  2019-02-18 21:07 ` [PATCH 5/6] page_isolation: Propagate temporary pageblock isolation error Gabriel Krisman Bertazi
@ 2019-02-18 21:07 ` Gabriel Krisman Bertazi
  2019-02-21 12:39 ` [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Vlastimil Babka
  2019-02-26 14:29 ` Christoph Hellwig
  7 siblings, 0 replies; 11+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-02-18 21:07 UTC (permalink / raw)
  To: linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski,
	Gabriel Krisman Bertazi

Holding the mutex when calling alloc_contig_range() is not essential,
because the bitmap reservation plus the implicit synchronization
mechanism inside start_isolate_page_range() already prevent simultaneous
allocations on the same pageblock.  It is still beneficial to perform
some kind of serialization on this path, though, to allow allocations to
retry the same pageblock, when possible, instead of immediately jumping
to another allocatable region.

Therefore, instead of serializing every CMA allocation, this patch
speculatively tries the allocation without acquiring the mutex.  If we
race with another thread allocating on the same pageblock, we retry on
the same region after waiting for the colliding allocations to finish.

The synchronization of aborted attempts is still done globally for the
CMA allocator.  Ideally, an aborted allocation would wait only for the
migration of the colliding pageblock, but there is no easy way to track
each pageblock isolation in a non-racy way without adding more code and
overhead.  Thus, I believe the global mutex to be an acceptable
compromise, as long as it doesn't stretch the mutex semantics too far.

Finally, some code paths, like the writeback case, must not blindly
sleep waiting for the mutex, because they may be a dependency of another
allocating thread that holds it, which would deadlock.  This exact
scenario was observed by Gael Portay, with a GPU thread whose CMA
allocation triggered a writeback, and a USB storage device on the same
ARM board trying to satisfy that writeback with a GFP_NOIO CMA
allocation [1].  For that reason, we prevent writeback threads from
waiting on the pageblock and instead let them move on to a readily
available contiguous memory region, effectively preventing the issue
reported in [1].

[1] https://groups.google.com/a/lists.one-eyed-alien.net/forum/#!topic/usb-storage/BXpAsg-G1us

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
 mm/cma.c | 29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 1dff74b1a8c5..ace978623b8d 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -411,6 +411,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	void *kaddr;
 	struct page *page = NULL;
 	int ret = -ENOMEM;
+	bool has_lock = false;
 
 	/* Be noisy about caller asking for unsupported flags. */
 	WARN_ON(unlikely(!(gfp_mask & __GFP_DIRECT_RECLAIM) ||
@@ -451,17 +452,39 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 		mutex_unlock(&cma->lock);
 
 		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
-		mutex_lock(&cma_mutex);
+
+		/* Mutual exclusion inside alloc_contig_range() is not
+		 * strictly necessary, but it makes the allocation a
+		 * little more likely to succeed, because it serializes
+		 * simultaneous allocations on the same pageblock.  We
+		 * cannot sleep on all paths, though, so try to do the
+		 * allocation speculatively, if we identify another
+		 * thread using the same pageblock, fallback to the
+		 * serial path mutex, if possible, or try another
+		 * pageblock, otherwise.
+		 */
+		has_lock = mutex_trylock(&cma_mutex);
+retry:
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
 					 gfp_mask);
-		mutex_unlock(&cma_mutex);
+
+		if (ret == -EAGAIN && (gfp_mask & __GFP_IO)) {
+			if (!has_lock) {
+				mutex_lock(&cma_mutex);
+				has_lock = true;
+			}
+			goto retry;
+		}
+		if (has_lock)
+			mutex_unlock(&cma_mutex);
+
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
 			break;
 		}
 
 		cma_clear_bitmap(cma, pfn, count);
-		if (ret != -EBUSY)
+		if (ret != -EBUSY && ret != -EAGAIN)
 			break;
 
 		pr_debug("%s(): memory range at %p is busy, retrying\n",
-- 
2.20.1



* Re: [PATCH 0/6] Improve handling of GFP flags in the CMA allocator
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
                   ` (5 preceding siblings ...)
  2019-02-18 21:07 ` [PATCH 6/6] cma: Isolate pageblocks speculatively during allocation Gabriel Krisman Bertazi
@ 2019-02-21 12:39 ` Vlastimil Babka
  2019-02-26 14:29 ` Christoph Hellwig
  7 siblings, 0 replies; 11+ messages in thread
From: Vlastimil Babka @ 2019-02-21 12:39 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi, linux-mm
  Cc: labbott, kernel, gael.portay, mike.kravetz, m.szyprowski, Michal Hocko

On 2/18/19 10:07 PM, Gabriel Krisman Bertazi wrote:
> Hi,
> 
> The main goal of this patchset is to solve a deadlock in the CMA
> allocator, which happens because cma_alloc tries to sleep waiting for an
> IO in the GFP_NOIO path.  This issue, which was reported by Gael Portay
> was discussed here:
> 
> https://groups.google.com/a/lists.one-eyed-alien.net/forum/#!topic/usb-storage/BXpAsg-G1us
> 
> My proposal requires reverting the patches that removed the gfp flags
> parameter from cma_alloc() (patches 1 to 3).  According to the author,
> that parameter was removed because it misled developers about what
> cma_alloc actually supports; in his specific case he had problems with
> GFP_ZERO.  With that in mind, patch 4 implements GFP_ZERO in a fairly
> trivial way.  Finally, patches 5 and 6 attempt to fix the deadlock
> itself by avoiding the unnecessary serialization done around
> alloc_contig_range.

I haven't checked in detail yet, but for GFP_NOIO we have
memalloc_noio_save() / memalloc_noio_restore(), which add an implicit
GFP_NOIO for the whole call stack.  That could perhaps be used to avoid
adding the gfp flags back to the function signatures.  Since you are
adding a new test for __GFP_IO in cma_alloc() in patch 6, you would use
e.g. current_gfp_context(GFP_KERNEL) first to add __GFP_NOIO based on
the implicit context.  As for the arm64 caller, maybe it is already in a
noio context (ideal world), or it could enter one based on a test before
calling dma_alloc_from_contiguous().  There's also some documentation in
Documentation/core-api/gfp_mask-from-fs-io.rst.
CCing Michal for an opinion, since he authored this.

> This is my first adventure in the mm subsystem, so I hope I didn't screw
> up something very obvious. I tested this on the workload that was
> deadlocking (arm board, with CMA intensive operations from the GPU and
> USB), as well as some scripting on top of debugfs.  Is there any
> regression test I should be running, which specially applies to the CMA
> code?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/6] Improve handling of GFP flags in the CMA allocator
  2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
                   ` (6 preceding siblings ...)
  2019-02-21 12:39 ` [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Vlastimil Babka
@ 2019-02-26 14:29 ` Christoph Hellwig
  2019-02-28  0:12   ` Laura Abbott
  7 siblings, 1 reply; 11+ messages in thread
From: Christoph Hellwig @ 2019-02-26 14:29 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: linux-mm, labbott, kernel, gael.portay, mike.kravetz, m.szyprowski

I don't think this is a good idea.  The whole concept of just passing
random GFP_ flags to dma_alloc_attrs / dma_alloc_coherent can't work,
given that on many architectures we need to set up new page tables
to remap the allocated memory, and we can't use arbitrary gfp flags
for pte allocations.

So instead of trying to pass them further down again, we need to
work on fixing all callers of dma_alloc_attrs / dma_alloc_coherent
that don't just pass GFP_KERNEL.



* Re: [PATCH 0/6] Improve handling of GFP flags in the CMA allocator
  2019-02-26 14:29 ` Christoph Hellwig
@ 2019-02-28  0:12   ` Laura Abbott
  2019-02-28  8:46     ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: Laura Abbott @ 2019-02-28  0:12 UTC (permalink / raw)
  To: Christoph Hellwig, Gabriel Krisman Bertazi
  Cc: linux-mm, kernel, gael.portay, mike.kravetz, m.szyprowski

On 2/26/19 6:29 AM, Christoph Hellwig wrote:
> I don't think this is a good idea.  The whole concept of just passing
> random GFP_ flags to dma_alloc_attrs / dma_alloc_coherent can't work,
> given that on many architectures we need to set up new page tables
> to remap the allocated memory, and we can't use arbitrary gfp flags
> for pte allocations.
> 
> So instead of trying to pass them further down again, we need to
> work on fixing all callers of dma_alloc_attrs / dma_alloc_coherent
> that don't just pass GFP_KERNEL.
> 

What's the expected approach to fix callers? It's not clear how
you would fix the callers for the case that prompted this series
(the context correctly used GFP_NOIO, but it was not passed to
dma_alloc_coherent).

Thanks,
Laura



* Re: [PATCH 0/6] Improve handling of GFP flags in the CMA allocator
  2019-02-28  0:12   ` Laura Abbott
@ 2019-02-28  8:46     ` Michal Hocko
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2019-02-28  8:46 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Christoph Hellwig, Gabriel Krisman Bertazi, linux-mm, kernel,
	gael.portay, mike.kravetz, m.szyprowski

On Wed 27-02-19 16:12:30, Laura Abbott wrote:
> On 2/26/19 6:29 AM, Christoph Hellwig wrote:
> > I don't think this is a good idea.  The whole concept of just passing
> > random GFP_ flags to dma_alloc_attrs / dma_alloc_coherent can't work,
> > given that on many architectures we need to set up new page tables
> > to remap the allocated memory, and we can't use arbitrary gfp flags
> > for pte allocations.
> > 
> > So instead of trying to pass them further down again, we need to
> > work on fixing all callers of dma_alloc_attrs / dma_alloc_coherent
> > that don't just pass GFP_KERNEL.
> > 
> 
> What's the expected approach to fix callers? It's not clear how
> you would fix the callers for the case that prompted this series
> (the context correctly used GFP_NOIO, but it was not passed to
> dma_alloc_coherent).

Use the scope API (memalloc_noio_{save,restore}) at the scope boundary
(the lock or other restriction that prohibits IO path recursion), and
you do not have to care about the specific allocation, because it will
inherit GFP_NOIO automagically.

-- 
Michal Hocko
SUSE Labs



end of thread, other threads:[~2019-02-28  8:46 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-18 21:07 [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Gabriel Krisman Bertazi
2019-02-18 21:07 ` [PATCH 1/6] Revert "kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()" Gabriel Krisman Bertazi
2019-02-18 21:07 ` [PATCH 2/6] Revert "mm/cma: remove unsupported gfp_mask parameter from cma_alloc()" Gabriel Krisman Bertazi
2019-02-18 21:07 ` [PATCH 3/6] cma: Warn about callers requesting unsupported flags Gabriel Krisman Bertazi
2019-02-18 21:07 ` [PATCH 4/6] cma: Add support for GFP_ZERO Gabriel Krisman Bertazi
2019-02-18 21:07 ` [PATCH 5/6] page_isolation: Propagate temporary pageblock isolation error Gabriel Krisman Bertazi
2019-02-18 21:07 ` [PATCH 6/6] cma: Isolate pageblocks speculatively during allocation Gabriel Krisman Bertazi
2019-02-21 12:39 ` [PATCH 0/6] Improve handling of GFP flags in the CMA allocator Vlastimil Babka
2019-02-26 14:29 ` Christoph Hellwig
2019-02-28  0:12   ` Laura Abbott
2019-02-28  8:46     ` Michal Hocko
