From: Dong Aisheng <aisheng.dong@nxp.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	dongas86@gmail.com, shawnguo@kernel.org, linux-imx@nxp.com,
	akpm@linux-foundation.org, m.szyprowski@samsung.com,
	lecopzer.chen@mediatek.com, david@redhat.com, vbabka@suse.cz,
	stable@vger.kernel.org, shijie.qin@nxp.com,
	Dong Aisheng <aisheng.dong@nxp.com>
Subject: [PATCH v3 1/2] mm: cma: fix allocation may fail sometimes
Date: Tue, 15 Mar 2022 22:45:20 +0800
Message-ID: <20220315144521.3810298-2-aisheng.dong@nxp.com>
In-Reply-To: <20220315144521.3810298-1-aisheng.dong@nxp.com>

When multiple processes allocate DMA memory in parallel by calling
dma_alloc_coherent(), the allocation may sometimes fail as follows:

Error log:
cma: cma_alloc: linux,cma: alloc failed, req-size: 148 pages, ret: -16
cma: number of available pages:
3@125+20@172+12@236+4@380+32@736+17@2287+23@2473+20@36076+99@40477+108@40852+44@41108+20@41196+108@41364+108@41620+
108@42900+108@43156+483@44061+1763@45341+1440@47712+20@49324+20@49388+5076@49452+2304@55040+35@58141+20@58220+20@58284+
7188@58348+84@66220+7276@66452+227@74525+6371@75549=> 33161 free of 81920 total pages

When the issue happened, there were still 33161 pages (129M) of free CMA
memory, and the CMA bitmap still had many free slots large enough for the
148 pages we wanted to allocate.

Dumping the memory info showed that there was also ~342M of free normal
memory, but only 1352K of CMA memory was left in the buddy system while a
lot of pageblocks were isolated.

Memory info log:
Normal free:351096kB min:30000kB low:37500kB high:45000kB reserved_highatomic:0KB
	active_anon:98060kB inactive_anon:98948kB active_file:60864kB
	inactive_file:31776kB unevictable:0kB writepending:0kB
	present:1048576kB managed:1018328kB mlocked:0kB bounce:0kB
	free_pcp:220kB local_pcp:192kB free_cma:1352kB
lowmem_reserve[]: 0 0 0
Normal: 78*4kB (UECI) 1772*8kB (UMECI) 1335*16kB (UMECI) 360*32kB (UMECI)
	65*64kB (UMCI) 36*128kB (UMECI) 16*256kB (UMCI) 6*512kB (EI)
	8*1024kB (UEI) 4*2048kB (MI) 8*4096kB (EI) 8*8192kB (UI)
	3*16384kB (EI) 8*32768kB (M) = 489288kB

The root cause of this issue is that since commit a4efc174b382
("mm/cma.c: remove redundant cma_mutex lock"), CMA supports concurrent
memory allocation. It is possible that the memory range process A is
trying to allocate has already been isolated by the allocation of
process B during memory migration.

The problem here is that the memory range isolated during one allocation
by start_isolate_page_range() can be much bigger than the size actually
requested, because the range is aligned to MAX_ORDER_NR_PAGES.

Taking an ARMv7 platform with 1G memory as an example, when
MAX_ORDER_NR_PAGES is big (e.g. 32M with max_order 14) and CMA memory is
relatively small (e.g. 128M), there are only 4 MAX_ORDER slots, so all
CMA memory may easily have been isolated by other processes already when
one process tries to allocate memory using dma_alloc_coherent().
Since the current CMA code scans the whole available CMA memory only
once, dma_alloc_coherent() may easily fail due to contention with other
processes.

This patch introduces a retry mechanism that rescans the CMA bitmap on an
-EBUSY error, in case the target memory range has been temporarily
isolated by others and is released later.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
CC: stable@vger.kernel.org # 5.11+
Fixes: a4efc174b382 ("mm/cma.c: remove redundant cma_mutex lock")
Signed-off-by: Dong Aisheng <aisheng.dong@nxp.com>
---
ChangeLog:
 * v2->v3: Improve commit messages
 * v1->v2: no changes
---
 mm/cma.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/mm/cma.c b/mm/cma.c
index eaa4b5c920a2..46a9fd9f92c4 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -430,6 +430,7 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
 	unsigned long i;
 	struct page *page = NULL;
 	int ret = -ENOMEM;
+	int loop = 0;
 
 	if (!cma || !cma->count || !cma->bitmap)
 		goto out;
@@ -457,6 +458,16 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
 				offset);
 		if (bitmap_no >= bitmap_maxno) {
 			spin_unlock_irq(&cma->lock);
+			pr_debug("%s(): alloc fail, retry loop %d\n", __func__, loop++);
+			/*
+			 * rescan as others may finish the memory migration
+			 * and quit if no available CMA memory found finally
+			 */
+			if (start) {
+				schedule();
+				start = 0;
+				continue;
+			}
 			break;
 		}
 		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
-- 
2.25.1