linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
@ 2020-10-16 22:52 Roman Gushchin
  2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Roman Gushchin @ 2020-10-16 22:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm,
	kernel-team, Roman Gushchin

This small patchset makes cma_release() non-blocking and simplifies
the code in hugetlbfs, where previously we had to temporarily drop
hugetlb_lock around the cma_release() call.

It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
THP under a memory pressure requires a cma_release() call. If it's
a blocking function, it complicates the already complicated code.
Because there are at least two use cases like this (hugetlbfs is
another example), I believe it's just better to make cma_release()
non-blocking.

It also makes it more consistent with other memory releasing functions
in the kernel: most of them are non-blocking.


Roman Gushchin (2):
  mm: cma: make cma_release() non-blocking
  mm: hugetlb: don't drop hugetlb_lock around cma_release() call

 mm/cma.c     | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/hugetlb.c |  6 ------
 2 files changed, 49 insertions(+), 8 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH rfc 1/2] mm: cma: make cma_release() non-blocking
  2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin
@ 2020-10-16 22:52 ` Roman Gushchin
  2020-10-16 22:52 ` [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call Roman Gushchin
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2020-10-16 22:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm,
	kernel-team, Roman Gushchin

cma_release() has to lock the cma_lock mutex to clear the cma bitmap.
It makes it a blocking function, which complicates its usage from
non-blocking contexts. For instance, hugetlbfs code is temporarily
dropping the hugetlb_lock spinlock to call cma_release().

This patch makes cma_release non-blocking by postponing the cma
bitmap clearance. It's done later from a work context. The first page
in the cma allocation is used to store the work struct.

To make sure that subsequent cma_alloc() call will pass, cma_alloc()
flushes the corresponding workqueue.

Because CMA allocations and de-allocations are usually not that
frequent, a single global workqueue is used.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/cma.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7cda9f..523cd9356bc7 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -36,10 +36,19 @@
 
 #include "cma.h"
 
+struct cma_clear_bitmap_work {
+	struct work_struct work;
+	struct cma *cma;
+	unsigned long pfn;
+	unsigned int count;
+};
+
 struct cma cma_areas[MAX_CMA_AREAS];
 unsigned cma_area_count;
 static DEFINE_MUTEX(cma_mutex);
 
+struct workqueue_struct *cma_release_wq;
+
 phys_addr_t cma_get_base(const struct cma *cma)
 {
 	return PFN_PHYS(cma->base_pfn);
@@ -148,6 +157,10 @@ static int __init cma_init_reserved_areas(void)
 	for (i = 0; i < cma_area_count; i++)
 		cma_activate_area(&cma_areas[i]);
 
+	cma_release_wq = create_workqueue("cma_release");
+	if (!cma_release_wq)
+		return -ENOMEM;
+
 	return 0;
 }
 core_initcall(cma_init_reserved_areas);
@@ -437,6 +450,13 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 		return NULL;
 
 	for (;;) {
+		/*
+		 * CMA bitmaps are cleared asynchronously from works,
+		 * scheduled by cma_release(). To make sure the allocation
+		 * will success, cma release workqueue is flushed here.
+		 */
+		flush_workqueue(cma_release_wq);
+
 		mutex_lock(&cma->lock);
 		bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
 				bitmap_maxno, start, bitmap_count, mask,
@@ -495,6 +515,17 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 	return page;
 }
 
+static void cma_clear_bitmap_fn(struct work_struct *work)
+{
+	struct cma_clear_bitmap_work *w;
+
+	w = container_of(work, struct cma_clear_bitmap_work, work);
+
+	cma_clear_bitmap(w->cma, w->pfn, w->count);
+
+	__free_page(pfn_to_page(w->pfn));
+}
+
 /**
  * cma_release() - release allocated pages
  * @cma:   Contiguous memory region for which the allocation is performed.
@@ -507,6 +538,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
  */
 bool cma_release(struct cma *cma, const struct page *pages, unsigned int count)
 {
+	struct cma_clear_bitmap_work *work;
 	unsigned long pfn;
 
 	if (!cma || !pages)
@@ -521,8 +553,23 @@ bool cma_release(struct cma *cma, const struct page *pages, unsigned int count)
 
 	VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
 
-	free_contig_range(pfn, count);
-	cma_clear_bitmap(cma, pfn, count);
+	/*
+	 * To make cma_release() non-blocking, cma bitmap is cleared from
+	 * a work context (see cma_clear_bitmap_fn()). The first page
+	 * in the cma allocation is used to store the work structure,
+	 * so it's released after the cma bitmap clearance. Other pages
+	 * are released immediately as previously.
+	 */
+	if (count > 1)
+		free_contig_range(pfn + 1, count - 1);
+
+	work = (struct cma_clear_bitmap_work *)page_to_virt(pages);
+	INIT_WORK(&work->work, cma_clear_bitmap_fn);
+	work->cma = cma;
+	work->pfn = pfn;
+	work->count = count;
+	queue_work(cma_release_wq, &work->work);
+
 	trace_cma_release(pfn, pages, count);
 
 	return true;
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call
  2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin
  2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin
@ 2020-10-16 22:52 ` Roman Gushchin
  2020-10-22  0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz
  2020-10-22  1:54 ` Xiaqing (A)
  3 siblings, 0 replies; 11+ messages in thread
From: Roman Gushchin @ 2020-10-16 22:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm,
	kernel-team, Roman Gushchin

cma_release() now is a non-blocking function, so there is no more need
to drop hugetlb_lock to call it.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/hugetlb.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fe76f8fd5a73..c8c892b1cabc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1312,14 +1312,8 @@ static void update_and_free_page(struct hstate *h, struct page *page)
 	set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
 	set_page_refcounted(page);
 	if (hstate_is_gigantic(h)) {
-		/*
-		 * Temporarily drop the hugetlb_lock, because
-		 * we might block in free_gigantic_page().
-		 */
-		spin_unlock(&hugetlb_lock);
 		destroy_compound_gigantic_page(page, huge_page_order(h));
 		free_gigantic_page(page, huge_page_order(h));
-		spin_lock(&hugetlb_lock);
 	} else {
 		__free_pages(page, huge_page_order(h));
 	}
-- 
2.26.2



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin
  2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin
  2020-10-16 22:52 ` [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call Roman Gushchin
@ 2020-10-22  0:15 ` Mike Kravetz
  2020-10-22  2:33   ` Roman Gushchin
  2020-10-22  1:54 ` Xiaqing (A)
  3 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2020-10-22  0:15 UTC (permalink / raw)
  To: Roman Gushchin, Andrew Morton
  Cc: Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team

On 10/16/20 3:52 PM, Roman Gushchin wrote:
> This small patchset makes cma_release() non-blocking and simplifies
> the code in hugetlbfs, where previously we had to temporarily drop
> hugetlb_lock around the cma_release() call.
> 
> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
> THP under a memory pressure requires a cma_release() call. If it's
> a blocking function, it complicates the already complicated code.
> Because there are at least two use cases like this (hugetlbfs is
> another example), I believe it's just better to make cma_release()
> non-blocking.
> 
> It also makes it more consistent with other memory releasing functions
> in the kernel: most of them are non-blocking.

Thanks for looking into this Roman.

I may be missing something, but why does cma_release have to be blocking
today?  Certainly, it takes the bitmap in cma_clear_bitmap and could
block.  However, I do not see why cma->lock has to be a mutex.  I may be
missing something, but I do not see any code protected by the mutex doing
anything that could sleep?

Could we simply change that mutex to a spinlock?
-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin
                   ` (2 preceding siblings ...)
  2020-10-22  0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz
@ 2020-10-22  1:54 ` Xiaqing (A)
  2020-10-22  2:45   ` Roman Gushchin
  3 siblings, 1 reply; 11+ messages in thread
From: Xiaqing (A) @ 2020-10-22  1:54 UTC (permalink / raw)
  To: Roman Gushchin, Andrew Morton
  Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team



On 2020/10/17 6:52, Roman Gushchin wrote:

> This small patchset makes cma_release() non-blocking and simplifies
> the code in hugetlbfs, where previously we had to temporarily drop
> hugetlb_lock around the cma_release() call.
>
> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
> THP under a memory pressure requires a cma_release() call. If it's
> a blocking function, it complicates the already complicated code.
> Because there are at least two use cases like this (hugetlbfs is
> another example), I believe it's just better to make cma_release()
> non-blocking.
>
> It also makes it more consistent with other memory releasing functions
> in the kernel: most of them are non-blocking.
>
>
> Roman Gushchin (2):
>    mm: cma: make cma_release() non-blocking
>    mm: hugetlb: don't drop hugetlb_lock around cma_release() call
>
>   mm/cma.c     | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
>   mm/hugetlb.c |  6 ------
>   2 files changed, 49 insertions(+), 8 deletions(-)
>
I don't think this patch is a good idea.It transfers part or even all of the time of
cma_release to cma_alloc, which is more concerned by performance indicators.
On Android phones, CPU resource competition is intense in many scenarios,
As a result, kernel threads and workers can be scheduled only after some ticks or more.
In this case, the performance of cma_alloc will deteriorate significantly,
which is not good news for many services on Android.




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-22  0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz
@ 2020-10-22  2:33   ` Roman Gushchin
  2020-10-22 16:42     ` Mike Kravetz
  0 siblings, 1 reply; 11+ messages in thread
From: Roman Gushchin @ 2020-10-22  2:33 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team

On Wed, Oct 21, 2020 at 05:15:53PM -0700, Mike Kravetz wrote:
> On 10/16/20 3:52 PM, Roman Gushchin wrote:
> > This small patchset makes cma_release() non-blocking and simplifies
> > the code in hugetlbfs, where previously we had to temporarily drop
> > hugetlb_lock around the cma_release() call.
> > 
> > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
> > THP under a memory pressure requires a cma_release() call. If it's
> > a blocking function, it complicates the already complicated code.
> > Because there are at least two use cases like this (hugetlbfs is
> > another example), I believe it's just better to make cma_release()
> > non-blocking.
> > 
> > It also makes it more consistent with other memory releasing functions
> > in the kernel: most of them are non-blocking.
> 
> Thanks for looking into this Roman.

Hi Mike,

> 
> I may be missing something, but why does cma_release have to be blocking
> today?  Certainly, it takes the bitmap in cma_clear_bitmap and could
> block.  However, I do not see why cma->lock has to be a mutex.  I may be
> missing something, but I do not see any code protected by the mutex doing
> anything that could sleep?
> 
> Could we simply change that mutex to a spinlock?

I actually have suggested it few months ago, but the idea was
opposed by Joonsoo: https://lkml.org/lkml/2020/4/3/12 .

The time of a bitmap operation is definitely not an issue in my context,
but I can't speak for something like an embedded/rt case.

Thanks!


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-22  1:54 ` Xiaqing (A)
@ 2020-10-22  2:45   ` Roman Gushchin
  2020-10-22  3:47     ` Xiaqing (A)
  0 siblings, 1 reply; 11+ messages in thread
From: Roman Gushchin @ 2020-10-22  2:45 UTC (permalink / raw)
  To: Xiaqing (A)
  Cc: Andrew Morton, Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel,
	linux-mm, kernel-team

On Thu, Oct 22, 2020 at 09:54:53AM +0800, Xiaqing (A) wrote:
> 
> 
> On 2020/10/17 6:52, Roman Gushchin wrote:
> 
> > This small patchset makes cma_release() non-blocking and simplifies
> > the code in hugetlbfs, where previously we had to temporarily drop
> > hugetlb_lock around the cma_release() call.
> > 
> > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
> > THP under a memory pressure requires a cma_release() call. If it's
> > a blocking function, it complicates the already complicated code.
> > Because there are at least two use cases like this (hugetlbfs is
> > another example), I believe it's just better to make cma_release()
> > non-blocking.
> > 
> > It also makes it more consistent with other memory releasing functions
> > in the kernel: most of them are non-blocking.
> > 
> > 
> > Roman Gushchin (2):
> >    mm: cma: make cma_release() non-blocking
> >    mm: hugetlb: don't drop hugetlb_lock around cma_release() call
> > 
> >   mm/cma.c     | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
> >   mm/hugetlb.c |  6 ------
> >   2 files changed, 49 insertions(+), 8 deletions(-)
> > 
> I don't think this patch is a good idea.It transfers part or even all of the time of
> cma_release to cma_alloc, which is more concerned by performance indicators.

I'm not quite sure: if cma_alloc() is racing with cma_release(), cma_alloc() will
wait for the cma_lock mutex anyway. So we don't really transfer anything to cma_alloc().

> On Android phones, CPU resource competition is intense in many scenarios,
> As a result, kernel threads and workers can be scheduled only after some ticks or more.
> In this case, the performance of cma_alloc will deteriorate significantly,
> which is not good news for many services on Android.

Ok, I agree, if the cpu is heavily loaded, it might affect the total execution time.

If we aren't going into the mutex->spinlock conversion direction (as Mike suggested),
we can address the performance concerns by introducing a cma_release_nowait() function,
so that the default cma_release() would work in the old way.
cma_release_nowait() can set an atomic flag on a cma area, which will cause following
cma_alloc()'s to flush the release queue. In this case there will be no performance
penalty unless somebody is using cma_release_nowait().
Will it work for you?

Thank you!


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-22  2:45   ` Roman Gushchin
@ 2020-10-22  3:47     ` Xiaqing (A)
  0 siblings, 0 replies; 11+ messages in thread
From: Xiaqing (A) @ 2020-10-22  3:47 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrew Morton, Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel,
	linux-mm, kernel-team



On 2020/10/22 10:45, Roman Gushchin wrote:

> On Thu, Oct 22, 2020 at 09:54:53AM +0800, Xiaqing (A) wrote:
>>
>> On 2020/10/17 6:52, Roman Gushchin wrote:
>>
>>> This small patchset makes cma_release() non-blocking and simplifies
>>> the code in hugetlbfs, where previously we had to temporarily drop
>>> hugetlb_lock around the cma_release() call.
>>>
>>> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
>>> THP under a memory pressure requires a cma_release() call. If it's
>>> a blocking function, it complicates the already complicated code.
>>> Because there are at least two use cases like this (hugetlbfs is
>>> another example), I believe it's just better to make cma_release()
>>> non-blocking.
>>>
>>> It also makes it more consistent with other memory releasing functions
>>> in the kernel: most of them are non-blocking.
>>>
>>>
>>> Roman Gushchin (2):
>>>     mm: cma: make cma_release() non-blocking
>>>     mm: hugetlb: don't drop hugetlb_lock around cma_release() call
>>>
>>>    mm/cma.c     | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
>>>    mm/hugetlb.c |  6 ------
>>>    2 files changed, 49 insertions(+), 8 deletions(-)
>>>
>> I don't think this patch is a good idea.It transfers part or even all of the time of
>> cma_release to cma_alloc, which is more concerned by performance indicators.
> I'm not quite sure: if cma_alloc() is racing with cma_release(), cma_alloc() will
> wait for the cma_lock mutex anyway. So we don't really transfer anything to cma_alloc().
>
>> On Android phones, CPU resource competition is intense in many scenarios,
>> As a result, kernel threads and workers can be scheduled only after some ticks or more.
>> In this case, the performance of cma_alloc will deteriorate significantly,
>> which is not good news for many services on Android.
> Ok, I agree, if the cpu is heavily loaded, it might affect the total execution time.
>
> If we aren't going into the mutex->spinlock conversion direction (as Mike suggested),
> we can address the performance concerns by introducing a cma_release_nowait() function,
> so that the default cma_release() would work in the old way.
> cma_release_nowait() can set an atomic flag on a cma area, which will cause following
> cma_alloc()'s to flush the release queue. In this case there will be no performance
> penalty unless somebody is using cma_release_nowait().
> Will it work for you?

That looks good to me.

Thanks!

>
> Thank you!
>
>




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-22  2:33   ` Roman Gushchin
@ 2020-10-22 16:42     ` Mike Kravetz
  2020-10-22 17:16       ` Roman Gushchin
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Kravetz @ 2020-10-22 16:42 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team

On 10/21/20 7:33 PM, Roman Gushchin wrote:
> On Wed, Oct 21, 2020 at 05:15:53PM -0700, Mike Kravetz wrote:
>> On 10/16/20 3:52 PM, Roman Gushchin wrote:
>>> This small patchset makes cma_release() non-blocking and simplifies
>>> the code in hugetlbfs, where previously we had to temporarily drop
>>> hugetlb_lock around the cma_release() call.
>>>
>>> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
>>> THP under a memory pressure requires a cma_release() call. If it's
>>> a blocking function, it complicates the already complicated code.
>>> Because there are at least two use cases like this (hugetlbfs is
>>> another example), I believe it's just better to make cma_release()
>>> non-blocking.
>>>
>>> It also makes it more consistent with other memory releasing functions
>>> in the kernel: most of them are non-blocking.
>>
>> Thanks for looking into this Roman.
> 
> Hi Mike,
> 
>>
>> I may be missing something, but why does cma_release have to be blocking
>> today?  Certainly, it takes the bitmap in cma_clear_bitmap and could
>> block.  However, I do not see why cma->lock has to be a mutex.  I may be
>> missing something, but I do not see any code protected by the mutex doing
>> anything that could sleep?
>>
>> Could we simply change that mutex to a spinlock?
> 
> I actually have suggested it few months ago, but the idea was
> opposed by Joonsoo: https://lkml.org/lkml/2020/4/3/12 .
> 
> The time of a bitmap operation is definitely not an issue in my context,
> but I can't speak for something like an embedded/rt case.
> 

I wonder if it may be time to look into replacing the cma area bitmap
with some other data structure?  Joonsoo was concerned about the time
required to traverse the bitmap for an 8GB area.  With new support for
allocating 1GB hugetlb pages from cma, I can imagine someone setting
up a cma area that is hundreds of GB if not TB in size.  It is going
take some time to traverse a bitmap describing such a huge area.

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-22 16:42     ` Mike Kravetz
@ 2020-10-22 17:16       ` Roman Gushchin
  2020-10-22 17:25         ` Mike Kravetz
  0 siblings, 1 reply; 11+ messages in thread
From: Roman Gushchin @ 2020-10-22 17:16 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team

On Thu, Oct 22, 2020 at 09:42:11AM -0700, Mike Kravetz wrote:
> On 10/21/20 7:33 PM, Roman Gushchin wrote:
> > On Wed, Oct 21, 2020 at 05:15:53PM -0700, Mike Kravetz wrote:
> >> On 10/16/20 3:52 PM, Roman Gushchin wrote:
> >>> This small patchset makes cma_release() non-blocking and simplifies
> >>> the code in hugetlbfs, where previously we had to temporarily drop
> >>> hugetlb_lock around the cma_release() call.
> >>>
> >>> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic
> >>> THP under a memory pressure requires a cma_release() call. If it's
> >>> a blocking function, it complicates the already complicated code.
> >>> Because there are at least two use cases like this (hugetlbfs is
> >>> another example), I believe it's just better to make cma_release()
> >>> non-blocking.
> >>>
> >>> It also makes it more consistent with other memory releasing functions
> >>> in the kernel: most of them are non-blocking.
> >>
> >> Thanks for looking into this Roman.
> > 
> > Hi Mike,
> > 
> >>
> >> I may be missing something, but why does cma_release have to be blocking
> >> today?  Certainly, it takes the bitmap in cma_clear_bitmap and could
> >> block.  However, I do not see why cma->lock has to be a mutex.  I may be
> >> missing something, but I do not see any code protected by the mutex doing
> >> anything that could sleep?
> >>
> >> Could we simply change that mutex to a spinlock?
> > 
> > I actually have suggested it few months ago, but the idea was
> > opposed by Joonsoo: https://lkml.org/lkml/2020/4/3/12 .
> > 
> > The time of a bitmap operation is definitely not an issue in my context,
> > but I can't speak for something like an embedded/rt case.
> > 
> 
> I wonder if it may be time to look into replacing the cma area bitmap
> with some other data structure?  Joonsoo was concerned about the time
> required to traverse the bitmap for an 8GB area.  With new support for
> allocating 1GB hugetlb pages from cma, I can imagine someone setting
> up a cma area that is hundreds of GB if not TB in size.  It is going
> take some time to traverse a bitmap describing such a huge area.

If the cma area is used exclusively for 1 GB allocations, the bitmap can
have only 1 bit per GB, so it shouldn't be a big problem.

Long-term I have some hopes to be able to allocate 1 GB pages without
a need to reserve a cma area: we can try to group pages based on their mobility
on a 1 GB scale, so that all non-movable pages will reside in few 1 GB blocks.
I'm looking into that direction, but don't have any results yet.

If this idea fails and we'll end up allocating a large cma area unconditionally
and shrink it on demand (I think Rik suggested something like this),
replacing the bitmap with something else sounds like a good idea to me.

As now, I want to unblock Zi Yan on his work on 1 GB THPs, so maybe
we can go with introducing cma_release_nowait(), as I suggested in
the other e-mail in this thread? Do you have any objections?

Thanks!


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking
  2020-10-22 17:16       ` Roman Gushchin
@ 2020-10-22 17:25         ` Mike Kravetz
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Kravetz @ 2020-10-22 17:25 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team

On 10/22/20 10:16 AM, Roman Gushchin wrote:
> As now, I want to unblock Zi Yan on his work on 1 GB THPs, so maybe
> we can go with introducing cma_release_nowait(), as I suggested in
> the other e-mail in this thread? Do you have any objections?

No objections from me.  And, I like that it could be used by the hugetlb
code to make it simpler.

-- 
Mike Kravetz


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2020-10-22 17:26 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin
2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin
2020-10-16 22:52 ` [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call Roman Gushchin
2020-10-22  0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz
2020-10-22  2:33   ` Roman Gushchin
2020-10-22 16:42     ` Mike Kravetz
2020-10-22 17:16       ` Roman Gushchin
2020-10-22 17:25         ` Mike Kravetz
2020-10-22  1:54 ` Xiaqing (A)
2020-10-22  2:45   ` Roman Gushchin
2020-10-22  3:47     ` Xiaqing (A)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).