linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is reserved
@ 2020-07-06  8:44 Barry Song
  2020-07-06 21:48 ` Roman Gushchin
  0 siblings, 1 reply; 5+ messages in thread
From: Barry Song @ 2020-07-06  8:44 UTC (permalink / raw)
  To: akpm
  Cc: linux-mm, linux-kernel, linuxarm, Barry Song, Roman Gushchin,
	Mike Kravetz, Jonathan Cameron

hugetlb_cma[0] can be NULL for various reasons, for example, when node0 has
no memory. Thus, a NULL hugetlb_cma[0] doesn't necessarily mean cma is not
enabled; gigantic pages might have been reserved on other nodes.

Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
 mm/hugetlb.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 57ece74e3aae..603aa854aa89 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2571,9 +2571,21 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
-				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
-				break;
+			if (IS_ENABLED(CONFIG_CMA)) {
+				int nid;
+				bool cma_reserved = false;
+
+				for_each_node_state(nid, N_ONLINE) {
+					if (hugetlb_cma[nid]) {
+						pr_warn_once("HugeTLB: hugetlb_cma is reserved, "
+								"skip boot time allocation\n");
+						cma_reserved = true;
+						break;
+					}
+				}
+
+				if (cma_reserved)
+					break;
 			}
 			if (!alloc_bootmem_huge_page(h))
 				break;
-- 
2.27.0




* Re: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is reserved
  2020-07-06  8:44 [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is reserved Barry Song
@ 2020-07-06 21:48 ` Roman Gushchin
  2020-07-06 22:14   ` Song Bao Hua (Barry Song)
  2020-07-06 22:30   ` Song Bao Hua (Barry Song)
  0 siblings, 2 replies; 5+ messages in thread
From: Roman Gushchin @ 2020-07-06 21:48 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, linux-kernel, linuxarm, Mike Kravetz, Jonathan Cameron

On Mon, Jul 06, 2020 at 08:44:05PM +1200, Barry Song wrote:

Hello, Barry!

> hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> no memory. Thus, NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> enabled. gigantic pages might have been reserved on other nodes.

Just curious, is it a real-life problem you've seen? If so, I wonder how
you're using the hugetlb_cma option, and what's the outcome?

> 
> Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> ---
>  mm/hugetlb.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 57ece74e3aae..603aa854aa89 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2571,9 +2571,21 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
>  
>  	for (i = 0; i < h->max_huge_pages; ++i) {
>  		if (hstate_is_gigantic(h)) {
> -			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
> -				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
> -				break;
> +			if (IS_ENABLED(CONFIG_CMA)) {
> +				int nid;
> +				bool cma_reserved = false;
> +
> +				for_each_node_state(nid, N_ONLINE) {
> +					if (hugetlb_cma[nid]) {
> +						pr_warn_once("HugeTLB: hugetlb_cma is reserved,"
> +								"skip boot time allocation\n");
> +						cma_reserved = true;
> +						break;
> +					}
> +				}
> +
> +				if (cma_reserved)
> +					break;

It's a valid problem, and I'd like to see it fixed. But I wonder if it would be better
to introduce a new helper, bool hugetlb_cma_enabled(), and move both the
IS_ENABLED(CONFIG_CMA) and hugetlb_cma[nid] checks there?
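
For illustration only, here is a minimal userspace sketch of what such a helper might look like. It is not the code that was eventually merged. MAX_NUMNODES, struct cma and CONFIG_CMA_ENABLED are stand-ins defined locally so the snippet compiles outside the kernel; in the kernel the config test would be IS_ENABLED(CONFIG_CMA) and the loop would likely use for_each_node_state() as in the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins so the sketch builds in userspace (assumptions, not kernel code). */
#define MAX_NUMNODES       4   /* hypothetical node count */
#define CONFIG_CMA_ENABLED 1   /* models IS_ENABLED(CONFIG_CMA) */

struct cma { int unused; };    /* opaque placeholder for the kernel's struct cma */

/* One CMA area pointer per node; NULL means nothing was reserved there. */
static struct cma *hugetlb_cma[MAX_NUMNODES];

/*
 * Return true if CMA support is built in and at least one node has a
 * hugetlb CMA area, instead of looking only at hugetlb_cma[0].
 */
static bool hugetlb_cma_enabled(void)
{
	int nid;

	if (!CONFIG_CMA_ENABLED)
		return false;

	for (nid = 0; nid < MAX_NUMNODES; nid++) {
		if (hugetlb_cma[nid])
			return true;
	}

	return false;
}
```

With a helper of this shape, the gigantic-page branch in hugetlb_hstate_alloc_pages() would reduce to a single hugetlb_cma_enabled() test followed by the warning and the break.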

Thank you!


* RE: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is reserved
  2020-07-06 21:48 ` Roman Gushchin
@ 2020-07-06 22:14   ` Song Bao Hua (Barry Song)
  2020-07-06 22:30   ` Song Bao Hua (Barry Song)
  1 sibling, 0 replies; 5+ messages in thread
From: Song Bao Hua (Barry Song) @ 2020-07-06 22:14 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: akpm, linux-mm, linux-kernel, Linuxarm, Mike Kravetz, Jonathan Cameron



> -----Original Message-----
> From: Roman Gushchin [mailto:guro@fb.com]
> Sent: Tuesday, July 7, 2020 9:48 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: akpm@linux-foundation.org; linux-mm@kvack.org;
> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Mike
> Kravetz <mike.kravetz@oracle.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>
> Subject: Re: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is
> reserved
> 
> On Mon, Jul 06, 2020 at 08:44:05PM +1200, Barry Song wrote:
> 
> Hello, Barry!
> 
> > hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
> > no memory. Thus, NULL hugetlb_cma[0] doesn't necessarily mean cma is not
> > enabled. gigantic pages might have been reserved on other nodes.
> 
> Just curious, is it a real-life problem you've seen? If so, I wonder how
> you're using the hugetlb_cma option, and what's the outcome?

Yes. It is kind of stupid, but I once got a board on which node0 had no DDR,
though node1 and node3 had memory.

I would actually prefer to compute the per-node cma size as:
cma size of one node = hugetlb_cma / (nodes with memory)
rather than:
cma size of one node = hugetlb_cma / (all online nodes)

but unfortunately the N_MEMORY infrastructure is not ready yet. I mean:

	for_each_node_state(nid, N_MEMORY) {
		int res;

		size = min(per_node, hugetlb_cma_size - reserved);
		size = round_up(size, PAGE_SIZE << order);

		res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
						 0, false, "hugetlb",
						 &hugetlb_cma[nid], nid);
		...
	}
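
To make the difference between the two formulas concrete, here is a small userspace simulation of the reservation loop. It is a sketch under stated assumptions: four online nodes of which only node1 and node3 have memory (the source only says node0 has no DDR while node1 and node3 do; node2 being online and memoryless is assumed), a reservation on a memoryless node is assumed to fail, and simulate_reserve(), round_up_to() and node_has_memory[] are hypothetical names that do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_NODES 4            /* assumption: nodes 0..3 are all online */
#define GIB (1ULL << 30)

/* Assumed layout: only node1 and node3 have memory. */
static const bool node_has_memory[NR_NODES] = { false, true, false, true };

/* Round x up to the next multiple of align. */
static unsigned long long round_up_to(unsigned long long x,
				      unsigned long long align)
{
	return (x + align - 1) / align * align;
}

/*
 * Simulate splitting a hugetlb_cma= request of 'total' bytes across nodes.
 * memory_nodes_only == false: divide by all online nodes and visit them all.
 * memory_nodes_only == true:  divide by, and visit, only nodes with memory.
 * Returns the number of bytes actually reserved.
 */
static unsigned long long simulate_reserve(unsigned long long total,
					   unsigned long long gigantic,
					   bool memory_nodes_only)
{
	unsigned long long reserved = 0, per_node, size;
	int nid, divisor = 0;

	for (nid = 0; nid < NR_NODES; nid++) {
		if (!memory_nodes_only || node_has_memory[nid])
			divisor++;
	}
	if (divisor == 0)
		return 0;

	per_node = (total + divisor - 1) / divisor;	/* divide, rounding up */

	for (nid = 0; nid < NR_NODES; nid++) {
		if (memory_nodes_only && !node_has_memory[nid])
			continue;

		size = per_node < total - reserved ? per_node : total - reserved;
		size = round_up_to(size, gigantic);

		/* Assumption: the reservation fails on a memoryless node. */
		if (!node_has_memory[nid])
			continue;

		reserved += size;
		if (reserved >= total)
			break;
	}

	return reserved;
}
```

Under these assumptions, a 4 GiB request with 1 GiB gigantic pages yields only 2 GiB when divided across all online nodes, because the shares assigned to node0 and node2 are lost, while dividing across nodes with memory reserves the full 4 GiB.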

> 
> >
> > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages
> using cma")
> > Cc: Roman Gushchin <guro@fb.com>
> > Cc: Mike Kravetz <mike.kravetz@oracle.com>
> > Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> > Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> > ---
> >  mm/hugetlb.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 57ece74e3aae..603aa854aa89 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -2571,9 +2571,21 @@ static void __init
> hugetlb_hstate_alloc_pages(struct hstate *h)
> >
> >  	for (i = 0; i < h->max_huge_pages; ++i) {
> >  		if (hstate_is_gigantic(h)) {
> > -			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
> > -				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip
> boot time allocation\n");
> > -				break;
> > +			if (IS_ENABLED(CONFIG_CMA)) {
> > +				int nid;
> > +				bool cma_reserved = false;
> > +
> > +				for_each_node_state(nid, N_ONLINE) {
> > +					if (hugetlb_cma[nid]) {
> > +						pr_warn_once("HugeTLB: hugetlb_cma is
> reserved,"
> > +								"skip boot time allocation\n");
> > +						cma_reserved = true;
> > +						break;
> > +					}
> > +				}
> > +
> > +				if (cma_reserved)
> > +					break;
> 
> It's a valid problem, and I like to see it fixed. But I wonder if it would be better
> to introduce a new helper bool hugetlb_cma_enabled()? And move both
> IS_ENABLED(CONFIG_CMA)
> and hugetlb_cma[nid] checks there?

Yep, that would be more readable.

> 
> Thank you!

Thanks
Barry



* RE: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is reserved
  2020-07-06 21:48 ` Roman Gushchin
  2020-07-06 22:14   ` Song Bao Hua (Barry Song)
@ 2020-07-06 22:30   ` Song Bao Hua (Barry Song)
  2020-07-06 23:26     ` Roman Gushchin
  1 sibling, 1 reply; 5+ messages in thread
From: Song Bao Hua (Barry Song) @ 2020-07-06 22:30 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: akpm, linux-mm, linux-kernel, Linuxarm, Mike Kravetz, Jonathan Cameron



> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Tuesday, July 7, 2020 10:12 AM
> To: 'Roman Gushchin' <guro@fb.com>
> Cc: akpm@linux-foundation.org; linux-mm@kvack.org;
> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Mike
> Kravetz <mike.kravetz@oracle.com>; Jonathan Cameron
> <jonathan.cameron@huawei.com>
> Subject: RE: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is
> reserved
> 
> 
> 
> > -----Original Message-----
> > From: Roman Gushchin [mailto:guro@fb.com]
> > Sent: Tuesday, July 7, 2020 9:48 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> > Cc: akpm@linux-foundation.org; linux-mm@kvack.org;
> > linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Mike
> > Kravetz <mike.kravetz@oracle.com>; Jonathan Cameron
> > <jonathan.cameron@huawei.com>
> > Subject: Re: [PATCH] mm/hugetlb: avoid hardcoding while checking if
> > cma is reserved
> >
> > On Mon, Jul 06, 2020 at 08:44:05PM +1200, Barry Song wrote:
> >
> > Hello, Barry!
> >
> > > hugetlb_cma[0] can be NULL due to various reasons, for example,
> > > node0 has no memory. Thus, NULL hugetlb_cma[0] doesn't necessarily
> > > mean cma is not enabled. gigantic pages might have been reserved on
> other nodes.
> >
> > Just curious, is it a real-life problem you've seen? If so, I wonder
> > how you're using the hugetlb_cma option, and what's the outcome?
> 
> Yes. It is kind of stupid but I once got a board on which node0 has no DDR
> though node1 and node3 have memory.
> 
> I actually prefer we get cma size of per node by:
> cma size of one node = hugetlb_cma/ (nodes with memory) rather than:
> cma size of one node = hugetlb_cma/ (all online nodes)
> 
> but unfortunately, or the N_MEMORY infrastructures are not ready yet. I
> mean:
> 
> for_each_node_state(nid, N_MEMORY) {
> 		int res;
> 
> 		size = min(per_node, hugetlb_cma_size - reserved);
> 		size = round_up(size, PAGE_SIZE << order);
> 
> 		res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
> 						 0, false, "hugetlb",
> 						 &hugetlb_cma[nid], nid);
> 		...
> 	}
> 

And on a server there are many memory slots. The best config would be for
every node to have at least one DDR, but that isn't necessarily the case; it
is totally up to the users.

If we move hugetlb_cma_reserve() a bit later, we can probably make the hugetlb_cma
size completely consistent by splitting it across nodes with memory rather than
across nodes which are online:

void __init bootmem_init(void)
{
	...

	arm64_numa_init();

	/*
	 * must be done after arm64_numa_init() which calls numa_init() to
	 * initialize node_online_map that gets used in hugetlb_cma_reserve()
	 * while allocating required CMA size across online nodes.
	 */
- #ifdef CONFIG_ARM64_4K_PAGES
-	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
- #endif

	...

	sparse_init();
	zone_sizes_init(min, max);

+ #ifdef CONFIG_ARM64_4K_PAGES
+	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+ #endif
	memblock_dump_all();
}

For x86, it could be done in a similar way. Do you think it is worth trying?

> >
> > >
> > > Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic
> > > hugepages
> > using cma")
> > > Cc: Roman Gushchin <guro@fb.com>
> > > Cc: Mike Kravetz <mike.kravetz@oracle.com>
> > > Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
> > > Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> > > ---
> > >  mm/hugetlb.c | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c index
> > > 57ece74e3aae..603aa854aa89 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -2571,9 +2571,21 @@ static void __init
> > hugetlb_hstate_alloc_pages(struct hstate *h)
> > >
> > >  	for (i = 0; i < h->max_huge_pages; ++i) {
> > >  		if (hstate_is_gigantic(h)) {
> > > -			if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
> > > -				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip
> > boot time allocation\n");
> > > -				break;
> > > +			if (IS_ENABLED(CONFIG_CMA)) {
> > > +				int nid;
> > > +				bool cma_reserved = false;
> > > +
> > > +				for_each_node_state(nid, N_ONLINE) {
> > > +					if (hugetlb_cma[nid]) {
> > > +						pr_warn_once("HugeTLB: hugetlb_cma is
> > reserved,"
> > > +								"skip boot time allocation\n");
> > > +						cma_reserved = true;
> > > +						break;
> > > +					}
> > > +				}
> > > +
> > > +				if (cma_reserved)
> > > +					break;
> >
> > It's a valid problem, and I like to see it fixed. But I wonder if it
> > would be better to introduce a new helper bool hugetlb_cma_enabled()?
> > And move both
> > IS_ENABLED(CONFIG_CMA)
> > and hugetlb_cma[nid] checks there?
> 
> Yep. that would be more readable.
> 
> >
> > Thank you!
> 
> Thanks
> Barry

Thanks
Barry



* Re: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is reserved
  2020-07-06 22:30   ` Song Bao Hua (Barry Song)
@ 2020-07-06 23:26     ` Roman Gushchin
  0 siblings, 0 replies; 5+ messages in thread
From: Roman Gushchin @ 2020-07-06 23:26 UTC (permalink / raw)
  To: Song Bao Hua (Barry Song)
  Cc: akpm, linux-mm, linux-kernel, Linuxarm, Mike Kravetz, Jonathan Cameron

On Mon, Jul 06, 2020 at 10:30:40PM +0000, Song Bao Hua (Barry Song) wrote:
> 
> 
> > -----Original Message-----
> > From: Song Bao Hua (Barry Song)
> > Sent: Tuesday, July 7, 2020 10:12 AM
> > To: 'Roman Gushchin' <guro@fb.com>
> > Cc: akpm@linux-foundation.org; linux-mm@kvack.org;
> > linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Mike
> > Kravetz <mike.kravetz@oracle.com>; Jonathan Cameron
> > <jonathan.cameron@huawei.com>
> > Subject: RE: [PATCH] mm/hugetlb: avoid hardcoding while checking if cma is
> > reserved
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Roman Gushchin [mailto:guro@fb.com]
> > > Sent: Tuesday, July 7, 2020 9:48 AM
> > > To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> > > Cc: akpm@linux-foundation.org; linux-mm@kvack.org;
> > > linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>; Mike
> > > Kravetz <mike.kravetz@oracle.com>; Jonathan Cameron
> > > <jonathan.cameron@huawei.com>
> > > Subject: Re: [PATCH] mm/hugetlb: avoid hardcoding while checking if
> > > cma is reserved
> > >
> > > On Mon, Jul 06, 2020 at 08:44:05PM +1200, Barry Song wrote:
> > >
> > > Hello, Barry!
> > >
> > > > hugetlb_cma[0] can be NULL due to various reasons, for example,
> > > > node0 has no memory. Thus, NULL hugetlb_cma[0] doesn't necessarily
> > > > mean cma is not enabled. gigantic pages might have been reserved on
> > other nodes.
> > >
> > > Just curious, is it a real-life problem you've seen? If so, I wonder
> > > how you're using the hugetlb_cma option, and what's the outcome?
> > 
> > Yes. It is kind of stupid but I once got a board on which node0 has no DDR
> > though node1 and node3 have memory.
> > 
> > I actually prefer we get cma size of per node by:
> > cma size of one node = hugetlb_cma/ (nodes with memory) rather than:
> > cma size of one node = hugetlb_cma/ (all online nodes)
> > 
> > but unfortunately, or the N_MEMORY infrastructures are not ready yet. I
> > mean:
> > 
> > for_each_node_state(nid, N_MEMORY) {
> > 		int res;
> > 
> > 		size = min(per_node, hugetlb_cma_size - reserved);
> > 		size = round_up(size, PAGE_SIZE << order);
> > 
> > 		res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order,
> > 						 0, false, "hugetlb",
> > 						 &hugetlb_cma[nid], nid);
> > 		...
> > 	}
> > 
> 
> And for a server, there are many memory slots. The best config would be
> making every node have at least one DDR. But it isn't necessarily true, it
> is totally up to the users.
> 
> If we move hugetlb_cma_reserve() a bit later, we probably make hugetlb_cma size
> completely consistent by splitting it to nodes with memory rather than nodes 
> which are online:
> 
> void __init bootmem_init(void)
> {
> 	...
> 
> 	arm64_numa_init();
> 
> 	/*
> 	 * must be done after arm64_numa_init() which calls numa_init() to
> 	 * initialize node_online_map that gets used in hugetlb_cma_reserve()
> 	 * while allocating required CMA size across online nodes.
> 	 */
> - #ifdef CONFIG_ARM64_4K_PAGES
> -	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
> - #endif
> 
> 	...
> 
> 	sparse_init();
> 	zone_sizes_init(min, max);
> 
> + #ifdef CONFIG_ARM64_4K_PAGES
> +	hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
> + #endif
> 	memblock_dump_all();
> }
> 
> For x86, it could be done in similar way. Do you think it is worth to try?

It sounds like a good idea to me!

Thanks.

