* [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking @ 2020-10-16 22:52 Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin ` (3 more replies) 0 siblings, 4 replies; 11+ messages in thread From: Roman Gushchin @ 2020-10-16 22:52 UTC (permalink / raw) To: Andrew Morton Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team, Roman Gushchin This small patchset makes cma_release() non-blocking and simplifies the code in hugetlbfs, where previously we had to temporarily drop hugetlb_lock around the cma_release() call. It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic THP under a memory pressure requires a cma_release() call. If it's a blocking function, it complicates the already complicated code. Because there are at least two use cases like this (hugetlbfs is another example), I believe it's just better to make cma_release() non-blocking. It also makes it more consistent with other memory releasing functions in the kernel: most of them are non-blocking. Roman Gushchin (2): mm: cma: make cma_release() non-blocking mm: hugetlb: don't drop hugetlb_lock around cma_release() call mm/cma.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-- mm/hugetlb.c | 6 ------ 2 files changed, 49 insertions(+), 8 deletions(-) -- 2.26.2 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH rfc 1/2] mm: cma: make cma_release() non-blocking 2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin @ 2020-10-16 22:52 ` Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call Roman Gushchin ` (2 subsequent siblings) 3 siblings, 0 replies; 11+ messages in thread From: Roman Gushchin @ 2020-10-16 22:52 UTC (permalink / raw) To: Andrew Morton Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team, Roman Gushchin cma_release() has to lock the cma_lock mutex to clear the cma bitmap. It makes it a blocking function, which complicates its usage from non-blocking contexts. For instance, hugetlbfs code is temporarily dropping the hugetlb_lock spinlock to call cma_release(). This patch makes cma_release non-blocking by postponing the cma bitmap clearance. It's done later from a work context. The first page in the cma allocation is used to store the work struct. To make sure that subsequent cma_alloc() call will pass, cma_alloc() flushes the corresponding workqueue. Because CMA allocations and de-allocations are usually not that frequent, a single global workqueue is used. Signed-off-by: Roman Gushchin <guro@fb.com> --- mm/cma.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 49 insertions(+), 2 deletions(-) diff --git a/mm/cma.c b/mm/cma.c index 7f415d7cda9f..523cd9356bc7 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -36,10 +36,19 @@ #include "cma.h" +struct cma_clear_bitmap_work { + struct work_struct work; + struct cma *cma; + unsigned long pfn; + unsigned int count; +}; + struct cma cma_areas[MAX_CMA_AREAS]; unsigned cma_area_count; static DEFINE_MUTEX(cma_mutex); +struct workqueue_struct *cma_release_wq; + phys_addr_t cma_get_base(const struct cma *cma) { return PFN_PHYS(cma->base_pfn); @@ -148,6 +157,10 @@ static int __init cma_init_reserved_areas(void) for (i = 0; i < cma_area_count; i++) cma_activate_area(&cma_areas[i]); + cma_release_wq = create_workqueue("cma_release"); + if (!cma_release_wq) + return -ENOMEM; + return 0; } core_initcall(cma_init_reserved_areas); @@ -437,6 +450,13 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, return NULL; for (;;) { + /* + * CMA bitmaps are cleared asynchronously from works, + * scheduled by cma_release(). To make sure the allocation + * will success, cma release workqueue is flushed here. + */ + flush_workqueue(cma_release_wq); + mutex_lock(&cma->lock); bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, bitmap_maxno, start, bitmap_count, mask, @@ -495,6 +515,17 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, return page; } +static void cma_clear_bitmap_fn(struct work_struct *work) +{ + struct cma_clear_bitmap_work *w; + + w = container_of(work, struct cma_clear_bitmap_work, work); + + cma_clear_bitmap(w->cma, w->pfn, w->count); + + __free_page(pfn_to_page(w->pfn)); +} + /** * cma_release() - release allocated pages * @cma: Contiguous memory region for which the allocation is performed. @@ -507,6 +538,7 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align, */ bool cma_release(struct cma *cma, const struct page *pages, unsigned int count) { + struct cma_clear_bitmap_work *work; unsigned long pfn; if (!cma || !pages) @@ -521,8 +553,23 @@ bool cma_release(struct cma *cma, const struct page *pages, unsigned int count) VM_BUG_ON(pfn + count > cma->base_pfn + cma->count); - free_contig_range(pfn, count); - cma_clear_bitmap(cma, pfn, count); + /* + * To make cma_release() non-blocking, cma bitmap is cleared from + * a work context (see cma_clear_bitmap_fn()). The first page + * in the cma allocation is used to store the work structure, + * so it's released after the cma bitmap clearance. Other pages + * are released immediately as previously. + */ + if (count > 1) + free_contig_range(pfn + 1, count - 1); + + work = (struct cma_clear_bitmap_work *)page_to_virt(pages); + INIT_WORK(&work->work, cma_clear_bitmap_fn); + work->cma = cma; + work->pfn = pfn; + work->count = count; + queue_work(cma_release_wq, &work->work); + trace_cma_release(pfn, pages, count); return true; -- 2.26.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call 2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin @ 2020-10-16 22:52 ` Roman Gushchin 2020-10-22 0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz 2020-10-22 1:54 ` Xiaqing (A) 3 siblings, 0 replies; 11+ messages in thread From: Roman Gushchin @ 2020-10-16 22:52 UTC (permalink / raw) To: Andrew Morton Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team, Roman Gushchin cma_release() now is a non-blocking function, so there is no more need to drop hugetlb_lock to call it. Signed-off-by: Roman Gushchin <guro@fb.com> --- mm/hugetlb.c | 6 ------ 1 file changed, 6 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index fe76f8fd5a73..c8c892b1cabc 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1312,14 +1312,8 @@ static void update_and_free_page(struct hstate *h, struct page *page) set_compound_page_dtor(page, NULL_COMPOUND_DTOR); set_page_refcounted(page); if (hstate_is_gigantic(h)) { - /* - * Temporarily drop the hugetlb_lock, because - * we might block in free_gigantic_page(). - */ - spin_unlock(&hugetlb_lock); destroy_compound_gigantic_page(page, huge_page_order(h)); free_gigantic_page(page, huge_page_order(h)); - spin_lock(&hugetlb_lock); } else { __free_pages(page, huge_page_order(h)); } -- 2.26.2 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call Roman Gushchin @ 2020-10-22 0:15 ` Mike Kravetz 2020-10-22 2:33 ` Roman Gushchin 2020-10-22 1:54 ` Xiaqing (A) 3 siblings, 1 reply; 11+ messages in thread From: Mike Kravetz @ 2020-10-22 0:15 UTC (permalink / raw) To: Roman Gushchin, Andrew Morton Cc: Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team On 10/16/20 3:52 PM, Roman Gushchin wrote: > This small patchset makes cma_release() non-blocking and simplifies > the code in hugetlbfs, where previously we had to temporarily drop > hugetlb_lock around the cma_release() call. > > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic > THP under a memory pressure requires a cma_release() call. If it's > a blocking function, it complicates the already complicated code. > Because there are at least two use cases like this (hugetlbfs is > another example), I believe it's just better to make cma_release() > non-blocking. > > It also makes it more consistent with other memory releasing functions > in the kernel: most of them are non-blocking. Thanks for looking into this Roman. I may be missing something, but why does cma_release have to be blocking today? Certainly, it takes the bitmap in cma_clear_bitmap and could block. However, I do not see why cma->lock has to be a mutex. I may be missing something, but I do not see any code protected by the mutex doing anything that could sleep? Could we simply change that mutex to a spinlock? -- Mike Kravetz ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-22 0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz @ 2020-10-22 2:33 ` Roman Gushchin 2020-10-22 16:42 ` Mike Kravetz 0 siblings, 1 reply; 11+ messages in thread From: Roman Gushchin @ 2020-10-22 2:33 UTC (permalink / raw) To: Mike Kravetz Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team On Wed, Oct 21, 2020 at 05:15:53PM -0700, Mike Kravetz wrote: > On 10/16/20 3:52 PM, Roman Gushchin wrote: > > This small patchset makes cma_release() non-blocking and simplifies > > the code in hugetlbfs, where previously we had to temporarily drop > > hugetlb_lock around the cma_release() call. > > > > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic > > THP under a memory pressure requires a cma_release() call. If it's > > a blocking function, it complicates the already complicated code. > > Because there are at least two use cases like this (hugetlbfs is > > another example), I believe it's just better to make cma_release() > > non-blocking. > > > > It also makes it more consistent with other memory releasing functions > > in the kernel: most of them are non-blocking. > > Thanks for looking into this Roman. Hi Mike, > > I may be missing something, but why does cma_release have to be blocking > today? Certainly, it takes the bitmap in cma_clear_bitmap and could > block. However, I do not see why cma->lock has to be a mutex. I may be > missing something, but I do not see any code protected by the mutex doing > anything that could sleep? > > Could we simply change that mutex to a spinlock? I actually have suggested it few months ago, but the idea was opposed by Joonsoo: https://lkml.org/lkml/2020/4/3/12 . The time of a bitmap operation is definitely not an issue in my context, but I can't speak for something like an embedded/rt case. Thanks! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-22 2:33 ` Roman Gushchin @ 2020-10-22 16:42 ` Mike Kravetz 2020-10-22 17:16 ` Roman Gushchin 0 siblings, 1 reply; 11+ messages in thread From: Mike Kravetz @ 2020-10-22 16:42 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team On 10/21/20 7:33 PM, Roman Gushchin wrote: > On Wed, Oct 21, 2020 at 05:15:53PM -0700, Mike Kravetz wrote: >> On 10/16/20 3:52 PM, Roman Gushchin wrote: >>> This small patchset makes cma_release() non-blocking and simplifies >>> the code in hugetlbfs, where previously we had to temporarily drop >>> hugetlb_lock around the cma_release() call. >>> >>> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic >>> THP under a memory pressure requires a cma_release() call. If it's >>> a blocking function, it complicates the already complicated code. >>> Because there are at least two use cases like this (hugetlbfs is >>> another example), I believe it's just better to make cma_release() >>> non-blocking. >>> >>> It also makes it more consistent with other memory releasing functions >>> in the kernel: most of them are non-blocking. >> >> Thanks for looking into this Roman. > > Hi Mike, > >> >> I may be missing something, but why does cma_release have to be blocking >> today? Certainly, it takes the bitmap in cma_clear_bitmap and could >> block. However, I do not see why cma->lock has to be a mutex. I may be >> missing something, but I do not see any code protected by the mutex doing >> anything that could sleep? >> >> Could we simply change that mutex to a spinlock? > > I actually have suggested it few months ago, but the idea was > opposed by Joonsoo: https://lkml.org/lkml/2020/4/3/12 . > > The time of a bitmap operation is definitely not an issue in my context, > but I can't speak for something like an embedded/rt case. > I wonder if it may be time to look into replacing the cma area bitmap with some other data structure? Joonsoo was concerned about the time required to traverse the bitmap for an 8GB area. With new support for allocating 1GB hugetlb pages from cma, I can imagine someone setting up a cma area that is hundreds of GB if not TB in size. It is going take some time to traverse a bitmap describing such a huge area. -- Mike Kravetz ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-22 16:42 ` Mike Kravetz @ 2020-10-22 17:16 ` Roman Gushchin 2020-10-22 17:25 ` Mike Kravetz 0 siblings, 1 reply; 11+ messages in thread From: Roman Gushchin @ 2020-10-22 17:16 UTC (permalink / raw) To: Mike Kravetz Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team On Thu, Oct 22, 2020 at 09:42:11AM -0700, Mike Kravetz wrote: > On 10/21/20 7:33 PM, Roman Gushchin wrote: > > On Wed, Oct 21, 2020 at 05:15:53PM -0700, Mike Kravetz wrote: > >> On 10/16/20 3:52 PM, Roman Gushchin wrote: > >>> This small patchset makes cma_release() non-blocking and simplifies > >>> the code in hugetlbfs, where previously we had to temporarily drop > >>> hugetlb_lock around the cma_release() call. > >>> > >>> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic > >>> THP under a memory pressure requires a cma_release() call. If it's > >>> a blocking function, it complicates the already complicated code. > >>> Because there are at least two use cases like this (hugetlbfs is > >>> another example), I believe it's just better to make cma_release() > >>> non-blocking. > >>> > >>> It also makes it more consistent with other memory releasing functions > >>> in the kernel: most of them are non-blocking. > >> > >> Thanks for looking into this Roman. > > > > Hi Mike, > > > >> > >> I may be missing something, but why does cma_release have to be blocking > >> today? Certainly, it takes the bitmap in cma_clear_bitmap and could > >> block. However, I do not see why cma->lock has to be a mutex. I may be > >> missing something, but I do not see any code protected by the mutex doing > >> anything that could sleep? > >> > >> Could we simply change that mutex to a spinlock? > > > > I actually have suggested it few months ago, but the idea was > > opposed by Joonsoo: https://lkml.org/lkml/2020/4/3/12 . > > > > The time of a bitmap operation is definitely not an issue in my context, > > but I can't speak for something like an embedded/rt case. > > > > I wonder if it may be time to look into replacing the cma area bitmap > with some other data structure? Joonsoo was concerned about the time > required to traverse the bitmap for an 8GB area. With new support for > allocating 1GB hugetlb pages from cma, I can imagine someone setting > up a cma area that is hundreds of GB if not TB in size. It is going > take some time to traverse a bitmap describing such a huge area. If the cma area is used exclusively for 1 GB allocations, the bitmap can have only 1 bit per GB, so it shouldn't be a big problem. Long-term I have some hopes to be able to allocate 1 GB pages without a need to reserve a cma area: we can try to group pages based on their mobility on a 1 GB scale, so that all non-movable pages will reside in few 1 GB blocks. I'm looking into that direction, but don't have any results yet. If this idea fails and we'll end up allocating a large cma area unconditionally and shrink it on demand (I think Rik suggested something like this), replacing the bitmap with something else sounds like a good idea to me. As now, I want to unblock Zi Yan on his work on 1 GB THPs, so maybe we can go with introducing cma_release_nowait(), as I suggested in the other e-mail in this thread? Do you have any objections? Thanks! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-22 17:16 ` Roman Gushchin @ 2020-10-22 17:25 ` Mike Kravetz 0 siblings, 0 replies; 11+ messages in thread From: Mike Kravetz @ 2020-10-22 17:25 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Zi Yan, Joonsoo Kim, linux-kernel, linux-mm, kernel-team On 10/22/20 10:16 AM, Roman Gushchin wrote: > As now, I want to unblock Zi Yan on his work on 1 GB THPs, so maybe > we can go with introducing cma_release_nowait(), as I suggested in > the other e-mail in this thread? Do you have any objections? No objections from me. And, I like that it could be used by the hugetlb code to make it simpler. -- Mike Kravetz ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin ` (2 preceding siblings ...) 2020-10-22 0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz @ 2020-10-22 1:54 ` Xiaqing (A) 2020-10-22 2:45 ` Roman Gushchin 3 siblings, 1 reply; 11+ messages in thread From: Xiaqing (A) @ 2020-10-22 1:54 UTC (permalink / raw) To: Roman Gushchin, Andrew Morton Cc: Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team On 2020/10/17 6:52, Roman Gushchin wrote: > This small patchset makes cma_release() non-blocking and simplifies > the code in hugetlbfs, where previously we had to temporarily drop > hugetlb_lock around the cma_release() call. > > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic > THP under a memory pressure requires a cma_release() call. If it's > a blocking function, it complicates the already complicated code. > Because there are at least two use cases like this (hugetlbfs is > another example), I believe it's just better to make cma_release() > non-blocking. > > It also makes it more consistent with other memory releasing functions > in the kernel: most of them are non-blocking. > > > Roman Gushchin (2): > mm: cma: make cma_release() non-blocking > mm: hugetlb: don't drop hugetlb_lock around cma_release() call > > mm/cma.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-- > mm/hugetlb.c | 6 ------ > 2 files changed, 49 insertions(+), 8 deletions(-) > I don't think this patch is a good idea.It transfers part or even all of the time of cma_release to cma_alloc, which is more concerned by performance indicators. On Android phones, CPU resource competition is intense in many scenarios, As a result, kernel threads and workers can be scheduled only after some ticks or more. In this case, the performance of cma_alloc will deteriorate significantly, which is not good news for many services on Android. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-22 1:54 ` Xiaqing (A) @ 2020-10-22 2:45 ` Roman Gushchin 2020-10-22 3:47 ` Xiaqing (A) 0 siblings, 1 reply; 11+ messages in thread From: Roman Gushchin @ 2020-10-22 2:45 UTC (permalink / raw) To: Xiaqing (A) Cc: Andrew Morton, Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team On Thu, Oct 22, 2020 at 09:54:53AM +0800, Xiaqing (A) wrote: > > > On 2020/10/17 6:52, Roman Gushchin wrote: > > > This small patchset makes cma_release() non-blocking and simplifies > > the code in hugetlbfs, where previously we had to temporarily drop > > hugetlb_lock around the cma_release() call. > > > > It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic > > THP under a memory pressure requires a cma_release() call. If it's > > a blocking function, it complicates the already complicated code. > > Because there are at least two use cases like this (hugetlbfs is > > another example), I believe it's just better to make cma_release() > > non-blocking. > > > > It also makes it more consistent with other memory releasing functions > > in the kernel: most of them are non-blocking. > > > > > > Roman Gushchin (2): > > mm: cma: make cma_release() non-blocking > > mm: hugetlb: don't drop hugetlb_lock around cma_release() call > > > > mm/cma.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-- > > mm/hugetlb.c | 6 ------ > > 2 files changed, 49 insertions(+), 8 deletions(-) > > > I don't think this patch is a good idea.It transfers part or even all of the time of > cma_release to cma_alloc, which is more concerned by performance indicators. I'm not quite sure: if cma_alloc() is racing with cma_release(), cma_alloc() will wait for the cma_lock mutex anyway. So we don't really transfer anything to cma_alloc(). > On Android phones, CPU resource competition is intense in many scenarios, > As a result, kernel threads and workers can be scheduled only after some ticks or more. > In this case, the performance of cma_alloc will deteriorate significantly, > which is not good news for many services on Android. Ok, I agree, if the cpu is heavily loaded, it might affect the total execution time. If we aren't going into the mutex->spinlock conversion direction (as Mike suggested), we can address the performance concerns by introducing a cma_release_nowait() function, so that the default cma_release() would work in the old way. cma_release_nowait() can set an atomic flag on a cma area, which will cause following cma_alloc()'s to flush the release queue. In this case there will be no performance penalty unless somebody is using cma_release_nowait(). Will it work for you? Thank you! ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking 2020-10-22 2:45 ` Roman Gushchin @ 2020-10-22 3:47 ` Xiaqing (A) 0 siblings, 0 replies; 11+ messages in thread From: Xiaqing (A) @ 2020-10-22 3:47 UTC (permalink / raw) To: Roman Gushchin Cc: Andrew Morton, Zi Yan, Joonsoo Kim, Mike Kravetz, linux-kernel, linux-mm, kernel-team On 2020/10/22 10:45, Roman Gushchin wrote: > On Thu, Oct 22, 2020 at 09:54:53AM +0800, Xiaqing (A) wrote: >> >> On 2020/10/17 6:52, Roman Gushchin wrote: >> >>> This small patchset makes cma_release() non-blocking and simplifies >>> the code in hugetlbfs, where previously we had to temporarily drop >>> hugetlb_lock around the cma_release() call. >>> >>> It should help Zi Yan on his work on 1 GB THPs: splitting a gigantic >>> THP under a memory pressure requires a cma_release() call. If it's >>> a blocking function, it complicates the already complicated code. >>> Because there are at least two use cases like this (hugetlbfs is >>> another example), I believe it's just better to make cma_release() >>> non-blocking. >>> >>> It also makes it more consistent with other memory releasing functions >>> in the kernel: most of them are non-blocking. >>> >>> >>> Roman Gushchin (2): >>> mm: cma: make cma_release() non-blocking >>> mm: hugetlb: don't drop hugetlb_lock around cma_release() call >>> >>> mm/cma.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++-- >>> mm/hugetlb.c | 6 ------ >>> 2 files changed, 49 insertions(+), 8 deletions(-) >>> >> I don't think this patch is a good idea.It transfers part or even all of the time of >> cma_release to cma_alloc, which is more concerned by performance indicators. > I'm not quite sure: if cma_alloc() is racing with cma_release(), cma_alloc() will > wait for the cma_lock mutex anyway. So we don't really transfer anything to cma_alloc(). > >> On Android phones, CPU resource competition is intense in many scenarios, >> As a result, kernel threads and workers can be scheduled only after some ticks or more. >> In this case, the performance of cma_alloc will deteriorate significantly, >> which is not good news for many services on Android. > Ok, I agree, if the cpu is heavily loaded, it might affect the total execution time. > > If we aren't going into the mutex->spinlock conversion direction (as Mike suggested), > we can address the performance concerns by introducing a cma_release_nowait() function, > so that the default cma_release() would work in the old way. > cma_release_nowait() can set an atomic flag on a cma area, which will cause following > cma_alloc()'s to flush the release queue. In this case there will be no performance > penalty unless somebody is using cma_release_nowait(). > Will it work for you? That looks good to me. Thanks! > > Thank you! > > ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2020-10-22 17:26 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-10-16 22:52 [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 1/2] " Roman Gushchin 2020-10-16 22:52 ` [PATCH rfc 2/2] mm: hugetlb: don't drop hugetlb_lock around cma_release() call Roman Gushchin 2020-10-22 0:15 ` [PATCH rfc 0/2] mm: cma: make cma_release() non-blocking Mike Kravetz 2020-10-22 2:33 ` Roman Gushchin 2020-10-22 16:42 ` Mike Kravetz 2020-10-22 17:16 ` Roman Gushchin 2020-10-22 17:25 ` Mike Kravetz 2020-10-22 1:54 ` Xiaqing (A) 2020-10-22 2:45 ` Roman Gushchin 2020-10-22 3:47 ` Xiaqing (A)
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).