* Why __alloc_contig_migrate_range calls  migrate_prep() at first?
@ 2016-06-01  3:42 Wang Sheng-Hui
  2016-06-01  7:40 ` Minchan Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Wang Sheng-Hui @ 2016-06-01  3:42 UTC (permalink / raw)
  To: akpm, mgorman, iamjoonsoo.kim; +Cc: linux-mm

Dear,

Sorry to trouble you.

I noticed that cma_alloc() ends up calling __alloc_contig_migrate_range() to allocate pages.
But __alloc_contig_migrate_range() calls migrate_prep() right at the start, so even when the
requested page is a single, already-free page, lru_add_drain_all() (called by migrate_prep())
still runs.

Imagine a large chunk of free contiguous pages reserved for CMA: various drivers may each
request a single page from the CMA area, and we end up running lru_add_drain_all() for every
one of those pages.

Should we check whether the requested pages are already free before calling migrate_prep(),
or at least do so for single-page allocations?
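
For illustration only, here is a minimal sketch of such a check. It assumes a hypothetical
helper, range_already_free(), added next to __alloc_contig_migrate_range() in mm/page_alloc.c;
nothing like it exists in the kernel, and locking against concurrent buddy merging/splitting
is ignored:

static bool range_already_free(unsigned long start, unsigned long end)
{
	unsigned long pfn = start;

	/*
	 * Walk [start, end) and give up as soon as we find a page that is
	 * not sitting in the buddy allocator; only a fully free range can
	 * safely skip migrate_prep().
	 */
	while (pfn < end) {
		struct page *page = pfn_to_page(pfn);

		if (!PageBuddy(page))
			return false;

		/* Skip over the whole buddy block this page heads. */
		pfn += 1UL << page_order(page);
	}

	return true;
}

The caller would then do something like "if (!range_already_free(start, end)) migrate_prep();"
instead of calling migrate_prep() unconditionally.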

------------------
Regards,
Wang Sheng-Hui


* Re: Why __alloc_contig_migrate_range calls  migrate_prep() at first?
  2016-06-01  3:42 Why __alloc_contig_migrate_range calls migrate_prep() at first? Wang Sheng-Hui
@ 2016-06-01  7:40 ` Minchan Kim
  2016-06-02  1:19   ` Wang Sheng-Hui
  0 siblings, 1 reply; 8+ messages in thread
From: Minchan Kim @ 2016-06-01  7:40 UTC (permalink / raw)
  To: Wang Sheng-Hui; +Cc: akpm, mgorman, iamjoonsoo.kim, linux-mm, Vlastimil Babka

On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> Dear,
> 
> Sorry to trouble you.
> 
> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
> 
> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
> the CMA area, we'll get  lru_add_drain_all run for each page.
> 
> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
> page allocation?

That makes sense to me.

How about calling migrate_prep() only once migrate_pages() has failed on the first attempt?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9d666df5ef95..c504c1a623d2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6623,8 +6623,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	unsigned int tries = 0;
 	int ret = 0;
 
-	migrate_prep();
-
 	while (pfn < end || !list_empty(&cc->migratepages)) {
 		if (fatal_signal_pending(current)) {
 			ret = -EINTR;
@@ -6650,6 +6648,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 		ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
 				    NULL, 0, cc->mode, MR_CMA);
+		if (ret)
+			migrate_prep();
 	}
 	if (ret < 0) {
 		putback_movable_pages(&cc->migratepages);
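
For reference, migrate_prep() at this point in the tree is essentially just a thin wrapper
around lru_add_drain_all() in mm/migrate.c, and that all-CPU drain is the cost the change
above defers until a migration attempt has actually failed:

int migrate_prep(void)
{
	/*
	 * Clear the LRU lists so pages can be isolated.
	 * Note that pages may be moved off the LRU after we have
	 * drained them. Those pages will fail to migrate like other
	 * pages that may be busy.
	 */
	lru_add_drain_all();

	return 0;
}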


> 
> ------------------
> Regards,
> Wang Sheng-Hui



* Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?
  2016-06-01  7:40 ` Minchan Kim
@ 2016-06-02  1:19   ` Wang Sheng-Hui
  2016-06-02  2:22     ` Joonsoo Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Wang Sheng-Hui @ 2016-06-02  1:19 UTC (permalink / raw)
  To: Minchan Kim; +Cc: akpm, mgorman, iamjoonsoo.kim, linux-mm, Vlastimil Babka



On 6/1/2016 3:40 PM, Minchan Kim wrote:
> On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
>> Dear,
>>
>> Sorry to trouble you.
>>
>> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
>> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
>> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
>>
>> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
>> the CMA area, we'll get  lru_add_drain_all run for each page.
>>
>> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
>> page allocation?
> That makes sense to me.
>
> How about calling migrate_prep once migrate_pages fails in the first trial?

Minchan,

I tried your patch in my environment, and the number of migrate_prep() calls dropped a lot.

In my case, CMA reserves 512MB, and the kernel calls migrate_prep() about 40 times during
boot, most of them for single-page allocation requests to CMA.
With your patch, migrate_prep() is no longer called for those single-page allocation requests,
since there are enough free pages in the CMA area.

Would you please push the patch upstream?

Thanks,
Sheng-Hui

>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9d666df5ef95..c504c1a623d2 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6623,8 +6623,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  	unsigned int tries = 0;
>  	int ret = 0;
>  
> -	migrate_prep();
> -
>  	while (pfn < end || !list_empty(&cc->migratepages)) {
>  		if (fatal_signal_pending(current)) {
>  			ret = -EINTR;
> @@ -6650,6 +6648,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  
>  		ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
>  				    NULL, 0, cc->mode, MR_CMA);
> +		if (ret)
> +			migrate_prep();
>  	}
>  	if (ret < 0) {
>  		putback_movable_pages(&cc->migratepages);
>
>
>> ------------------
>> Regards,
>> Wang Sheng-Hui





* Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?
  2016-06-02  1:19   ` Wang Sheng-Hui
@ 2016-06-02  2:22     ` Joonsoo Kim
  2016-06-02  4:29       ` Minchan Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Joonsoo Kim @ 2016-06-02  2:22 UTC (permalink / raw)
  To: Wang Sheng-Hui; +Cc: Minchan Kim, akpm, mgorman, linux-mm, Vlastimil Babka

On Thu, Jun 02, 2016 at 09:19:19AM +0800, Wang Sheng-Hui wrote:
> 
> 
> On 6/1/2016 3:40 PM, Minchan Kim wrote:
> > On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> >> Dear,
> >>
> >> Sorry to trouble you.
> >>
> >> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
> >> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
> >> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
> >>
> >> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
> >> the CMA area, we'll get  lru_add_drain_all run for each page.
> >>
> >> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
> >> page allocation?
> > That makes sense to me.
> >
> > How about calling migrate_prep once migrate_pages fails in the first trial?
> 
> Minchan,
> 
> I tried your patch in my env, and the number of calling migrate_prep() dropped a lot.
> 
> In my case, CMA reserved 512MB, and the linux will call migrate_prep() 40~ times during bootup,
> most are single page allocation request to CMA.
> With your patch, migrate_prep() is not called for the single pages allocation requests as the free
> pages in CMA area is enough.
> 
> Will you please push the patch to upstream?

That is not correct.

migrate_prep() is there to drain pages from the per-CPU LRU-add caches (pagevecs) onto the
LRU lists. In isolate_migratepages_range(), pages that are not on an LRU list are simply
skipped, so a page still sitting in a pagevec is not isolated and no error is returned.
As a result, "if (ret) migrate_prep()" never fires and we cannot catch such a page.
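
A simplified illustration of that failure mode (a sketch only, not the exact loop in
isolate_migratepages_block()):

for (; low_pfn < end_pfn; low_pfn++) {
	struct page *page = pfn_to_page(low_pfn);

	/*
	 * A page still parked in a per-CPU pagevec does not have PageLRU
	 * set yet, so it is silently skipped: it never lands on
	 * cc->migratepages, migrate_pages() has nothing to fail on, and
	 * ret stays 0.
	 */
	if (!PageLRU(page))
		continue;

	/* PageLRU pages get isolated onto cc->migratepages here. */
}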

Anyway, a better optimization for your case should be done at a higher level. See the
following patch: it skips the pointless pageblock isolation and migration entirely when
possible. In fact, we could do even better than the patch below by introducing an
alloc_contig_range() light mode that skips migrate_prep() and other high-cost work, but
that needs more surgery. I will revisit it soon.

Thanks.

----------->8---------------
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1a7f110..4af3665 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7314,6 +7314,51 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
                                pageblock_nr_pages));
 }
 
+static int alloc_contig_range_fast(unsigned long start, unsigned long end, struct compact_control *cc)
+{
+       unsigned int order;
+       unsigned long outer_start, outer_end;
+       int ret = 0;
+
+       order = 0;
+       outer_start = start;
+       while (!PageBuddy(pfn_to_page(outer_start))) {
+               if (++order >= MAX_ORDER) {
+                       outer_start = start;
+                       break;
+               }
+               outer_start &= ~0UL << order;
+       }
+
+       if (outer_start != start) {
+               order = page_order(pfn_to_page(outer_start));
+
+               /*
+                * outer_start page could be small order buddy page and
+                * it doesn't include start page. Adjust outer_start
+                * in this case to report failed page properly
+                * on tracepoint in test_pages_isolated()
+                */
+               if (outer_start + (1UL << order) <= start)
+                       outer_start = start;
+       }
+
+       /* Grab isolated pages from freelists. */
+       outer_end = isolate_freepages_range(cc, outer_start, end);
+       if (!outer_end) {
+               ret = -EBUSY;
+               goto done;
+       }
+
+       if (start != outer_start)
+               free_contig_range(outer_start, start - outer_start);
+       if (end != outer_end)
+               free_contig_range(end, outer_end - end);
+
+done:
+       return ret;
+}
+
 /* [start, end) must belong to a single zone. */
 static int __alloc_contig_migrate_range(struct compact_control *cc,
                                        unsigned long start, unsigned long end)
@@ -7390,6 +7435,9 @@ int alloc_contig_range(unsigned long start, unsigned long end)
        };
        INIT_LIST_HEAD(&cc.migratepages);
 
+       if (!alloc_contig_range_fast(start, end, &cc))
+               return 0;
+
        /*
         * What we do here is we mark all pageblocks in range as
         * MIGRATE_ISOLATE.  Because pageblock and max order pages may




* Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?
  2016-06-02  2:22     ` Joonsoo Kim
@ 2016-06-02  4:29       ` Minchan Kim
  2016-06-02  6:29         ` Joonsoo Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Minchan Kim @ 2016-06-02  4:29 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Wang Sheng-Hui, akpm, mgorman, linux-mm, Vlastimil Babka

On Thu, Jun 02, 2016 at 11:22:43AM +0900, Joonsoo Kim wrote:
> On Thu, Jun 02, 2016 at 09:19:19AM +0800, Wang Sheng-Hui wrote:
> > 
> > 
> > On 6/1/2016 3:40 PM, Minchan Kim wrote:
> > > On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> > >> Dear,
> > >>
> > >> Sorry to trouble you.
> > >>
> > >> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
> > >> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
> > >> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
> > >>
> > >> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
> > >> the CMA area, we'll get  lru_add_drain_all run for each page.
> > >>
> > >> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
> > >> page allocation?
> > > That makes sense to me.
> > >
> > > How about calling migrate_prep once migrate_pages fails in the first trial?
> > 
> > Minchan,
> > 
> > I tried your patch in my env, and the number of calling migrate_prep() dropped a lot.
> > 
> > In my case, CMA reserved 512MB, and the linux will call migrate_prep() 40~ times during bootup,
> > most are single page allocation request to CMA.
> > With your patch, migrate_prep() is not called for the single pages allocation requests as the free
> > pages in CMA area is enough.
> > 
> > Will you please push the patch to upstream?
> 
> It is not correct.
> 
> migrate_prep() is called to move lru pages in lruvec to LRU. In
> isolate_migratepages_range(), non LRU pages are just skipped so if
> page is on the lruvec it will not be isolated and error isn't returned.
> So, "if (ret) migrate_prep()" will not be called and we can't catch
> the page in lruvec.

Ah, true. Thanks for the correction.

A simple fix is to remove migrate_prep() there and retry once if test_pages_isolated()
finds that migration failed.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7da8310b86e9..e0aa4a9b573d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7294,8 +7294,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	unsigned int tries = 0;
 	int ret = 0;
 
-	migrate_prep();
-
 	while (pfn < end || !list_empty(&cc->migratepages)) {
 		if (fatal_signal_pending(current)) {
 			ret = -EINTR;
@@ -7355,6 +7353,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	unsigned long outer_start, outer_end;
 	unsigned int order;
 	int ret = 0;
+	bool lru_flushed = false;
 
 	struct compact_control cc = {
 		.nr_migratepages = 0,
@@ -7395,6 +7394,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	if (ret)
 		return ret;
 
+again:
 	/*
 	 * In case of -EBUSY, we'd like to know which page causes problem.
 	 * So, just fall through. We will check it in test_pages_isolated().
@@ -7448,6 +7448,11 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 
 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
+		if (!lru_flushed) {
+			lru_flushed = true;
+			goto again;
+		}
+
 		pr_info("%s: [%lx, %lx) PFNs busy\n",
 			__func__, outer_start, end);
 		ret = -EBUSY;
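
For context (not part of the diff, and quoted from memory of the surrounding code): the
"again" label lands just in front of this existing sequence in alloc_contig_range(), so a
retry re-runs both the migration attempt and the drains that are already there before the
pages are re-checked with test_pages_isolated():

again:
	/*
	 * In case of -EBUSY, we'd like to know which page causes problem.
	 * So, just fall through. We will check it in test_pages_isolated().
	 */
	ret = __alloc_contig_migrate_range(&cc, start, end);
	if (ret && ret != -EBUSY)
		goto done;

	/* Drains alloc_contig_range() already performs today. */
	lru_add_drain_all();
	drain_all_pages(cc.zone);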


> 
> Anyway, better optimization for your case should be done in higher
> level. See following patch. It removes useless pageblock isolation and migration
> if possible. In fact, even we can do better than below by inroducing
> alloc_contig_range() light mode that skip migrate_prep() and other high cost things
> but it needs more surgery. I will revisit it soon.

Yes, there is a lot of room for improvement in cma_alloc(), and I remember that a few
years ago some people (graphics folks, maybe) complained that cma_alloc() for small-order
pages is really slow, so it would definitely be worth doing.

I'm looking forward to seeing that.

> 
> Thanks.
> 
> ----------->8---------------
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1a7f110..4af3665 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7314,6 +7314,51 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
>                                 pageblock_nr_pages));
>  }
>  
> +static int alloc_contig_range_fast(unsigned long start, unsigned long end, struct compact_control *cc)
> +{
> +       unsigned int order;
> +       unsigned long outer_start, outer_end;
> +       int ret = 0;
> +
> +       order = 0;
> +       outer_start = start;
> +       while (!PageBuddy(pfn_to_page(outer_start))) {
> +               if (++order >= MAX_ORDER) {
> +                       outer_start = start;
> +                       break;
> +               }
> +               outer_start &= ~0UL << order;
> +       }
> +
> +       if (outer_start != start) {
> +               order = page_order(pfn_to_page(outer_start));
> +
> +               /*
> +                * outer_start page could be small order buddy page and
> +                * it doesn't include start page. Adjust outer_start
> +                * in this case to report failed page properly
> +                * on tracepoint in test_pages_isolated()
> +                */
> +               if (outer_start + (1UL << order) <= start)
> +                       outer_start = start;
> +       }
> +
> +       /* Grab isolated pages from freelists. */
> +       outer_end = isolate_freepages_range(cc, outer_start, end);
> +       if (!outer_end) {
> +               ret = -EBUSY;
> +               goto done;
> +       }
> +
> +       if (start != outer_start)
> +               free_contig_range(outer_start, start - outer_start);
> +       if (end != outer_end)
> +               free_contig_range(end, outer_end - end);
> +
> +done:
> +       return ret;
> +}
> +
>  /* [start, end) must belong to a single zone. */
>  static int __alloc_contig_migrate_range(struct compact_control *cc,
>                                         unsigned long start, unsigned long end)
> @@ -7390,6 +7435,9 @@ int alloc_contig_range(unsigned long start, unsigned long end)
>         };
>         INIT_LIST_HEAD(&cc.migratepages);
>  
> +       if (!alloc_contig_range_fast(start, end, &cc))
> +               return 0;
> +
>         /*
>          * What we do here is we mark all pageblocks in range as
>          * MIGRATE_ISOLATE.  Because pageblock and max order pages may
> 
> 



* Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?
  2016-06-02  4:29       ` Minchan Kim
@ 2016-06-02  6:29         ` Joonsoo Kim
  2016-06-02  6:46           ` Minchan Kim
  0 siblings, 1 reply; 8+ messages in thread
From: Joonsoo Kim @ 2016-06-02  6:29 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Wang Sheng-Hui, akpm, mgorman, linux-mm, Vlastimil Babka

On Thu, Jun 02, 2016 at 01:29:16PM +0900, Minchan Kim wrote:
> On Thu, Jun 02, 2016 at 11:22:43AM +0900, Joonsoo Kim wrote:
> > On Thu, Jun 02, 2016 at 09:19:19AM +0800, Wang Sheng-Hui wrote:
> > > 
> > > 
> > > On 6/1/2016 3:40 PM, Minchan Kim wrote:
> > > > On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> > > >> Dear,
> > > >>
> > > >> Sorry to trouble you.
> > > >>
> > > >> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
> > > >> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
> > > >> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
> > > >>
> > > >> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
> > > >> the CMA area, we'll get  lru_add_drain_all run for each page.
> > > >>
> > > >> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
> > > >> page allocation?
> > > > That makes sense to me.
> > > >
> > > > How about calling migrate_prep once migrate_pages fails in the first trial?
> > > 
> > > Minchan,
> > > 
> > > I tried your patch in my env, and the number of calling migrate_prep() dropped a lot.
> > > 
> > > In my case, CMA reserved 512MB, and the linux will call migrate_prep() 40~ times during bootup,
> > > most are single page allocation request to CMA.
> > > With your patch, migrate_prep() is not called for the single pages allocation requests as the free
> > > pages in CMA area is enough.
> > > 
> > > Will you please push the patch to upstream?
> > 
> > It is not correct.
> > 
> > migrate_prep() is called to move lru pages in lruvec to LRU. In
> > isolate_migratepages_range(), non LRU pages are just skipped so if
> > page is on the lruvec it will not be isolated and error isn't returned.
> > So, "if (ret) migrate_prep()" will not be called and we can't catch
> > the page in lruvec.
> 
> Ah,, true. Thanks for correcting.
> 
> Simple fix is to remove migrate_prep in there and retry if test_pages_isolated
> found migration is failed at least once.

Hmm... much better than before. But it makes me wonder what his real pain point is. He
wants to get rid of a migrate_prep() that calls lru_add_drain_all() needlessly, yet we
already have another lru_add_drain_all() in alloc_contig_range(), so he will not be
entirely happy with the change below.
Moreover, that lru_add_drain_all() sits there without any real justification. It exists
because the to-be-migrated pages need to be gathered onto the LRU lists, but I think
calling lru_add_drain_cpu() and drain_local_pages() respectively would be sufficient.
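
A hypothetical, untested sketch of that suggestion (lru_add_drain() is the existing wrapper
that runs lru_add_drain_cpu() on the calling CPU):

	/* Drain only the local CPU instead of IPI-broadcasting to all CPUs. */
	lru_add_drain();		/* flush this CPU's LRU-add pagevecs */
	drain_local_pages(NULL);	/* flush this CPU's per-CPU free lists */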

> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7da8310b86e9..e0aa4a9b573d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7294,8 +7294,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  	unsigned int tries = 0;
>  	int ret = 0;
>  
> -	migrate_prep();
> -
>  	while (pfn < end || !list_empty(&cc->migratepages)) {
>  		if (fatal_signal_pending(current)) {
>  			ret = -EINTR;
> @@ -7355,6 +7353,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	unsigned long outer_start, outer_end;
>  	unsigned int order;
>  	int ret = 0;
> +	bool lru_flushed = false;
>  
>  	struct compact_control cc = {
>  		.nr_migratepages = 0,
> @@ -7395,6 +7394,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	if (ret)
>  		return ret;
>  
> +again:
>  	/*
>  	 * In case of -EBUSY, we'd like to know which page causes problem.
>  	 * So, just fall through. We will check it in test_pages_isolated().
> @@ -7448,6 +7448,11 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  
>  	/* Make sure the range is really isolated. */
>  	if (test_pages_isolated(outer_start, end, false)) {
> +		if (!lru_flushed) {
> +			lru_flushed = true;
> +			goto again;
> +		}
> +
>  		pr_info("%s: [%lx, %lx) PFNs busy\n",
>  			__func__, outer_start, end);
>  		ret = -EBUSY;
> 
> 
> > 
> > Anyway, better optimization for your case should be done in higher
> > level. See following patch. It removes useless pageblock isolation and migration
> > if possible. In fact, even we can do better than below by inroducing
> > alloc_contig_range() light mode that skip migrate_prep() and other high cost things
> > but it needs more surgery. I will revisit it soon.
> 
> Yes, there are many rooms to be improved in cma_alloc and I remember
> a few years ago, some guys(maybe, graphic) complained cma_alloc for small order page
> is really slow so it would be really worth to do.
> 
> I'm looking forward to seeing that.

Okay. Will be back soon.

Thanks.



* Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?
  2016-06-02  6:29         ` Joonsoo Kim
@ 2016-06-02  6:46           ` Minchan Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Minchan Kim @ 2016-06-02  6:46 UTC (permalink / raw)
  To: Joonsoo Kim; +Cc: Wang Sheng-Hui, akpm, mgorman, linux-mm, Vlastimil Babka

On Thu, Jun 02, 2016 at 03:29:18PM +0900, Joonsoo Kim wrote:
> On Thu, Jun 02, 2016 at 01:29:16PM +0900, Minchan Kim wrote:
> > On Thu, Jun 02, 2016 at 11:22:43AM +0900, Joonsoo Kim wrote:
> > > On Thu, Jun 02, 2016 at 09:19:19AM +0800, Wang Sheng-Hui wrote:
> > > > 
> > > > 
> > > > On 6/1/2016 3:40 PM, Minchan Kim wrote:
> > > > > On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> > > > >> Dear,
> > > > >>
> > > > >> Sorry to trouble you.
> > > > >>
> > > > >> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
> > > > >> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
> > > > >> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
> > > > >>
> > > > >> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
> > > > >> the CMA area, we'll get  lru_add_drain_all run for each page.
> > > > >>
> > > > >> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
> > > > >> page allocation?
> > > > > That makes sense to me.
> > > > >
> > > > > How about calling migrate_prep once migrate_pages fails in the first trial?
> > > > 
> > > > Minchan,
> > > > 
> > > > I tried your patch in my env, and the number of calling migrate_prep() dropped a lot.
> > > > 
> > > > In my case, CMA reserved 512MB, and the linux will call migrate_prep() 40~ times during bootup,
> > > > most are single page allocation request to CMA.
> > > > With your patch, migrate_prep() is not called for the single pages allocation requests as the free
> > > > pages in CMA area is enough.
> > > > 
> > > > Will you please push the patch to upstream?
> > > 
> > > It is not correct.
> > > 
> > > migrate_prep() is called to move lru pages in lruvec to LRU. In
> > > isolate_migratepages_range(), non LRU pages are just skipped so if
> > > page is on the lruvec it will not be isolated and error isn't returned.
> > > So, "if (ret) migrate_prep()" will not be called and we can't catch
> > > the page in lruvec.
> > 
> > Ah,, true. Thanks for correcting.
> > 
> > Simple fix is to remove migrate_prep in there and retry if test_pages_isolated
> > found migration is failed at least once.
> 
> Hmm...much better than before. But, it makes me wonder what his
> painpoint is. He want to remove migrate_prep() which calls
> lru_add_drain_all() needlessly. But, we already have one in alloc_contig_range().
> So, he will not be happy entirely with following change.

It will reduce the number of drain IPI calls, so it would still be a win compared to the
old behaviour.

> Moreover, lru_add_drain_all() is there without any validation. It is there
> since we need to gather migrated pages in lruvec but I think that it
> is sufficient to call lru_add_drain_cpu() and drain_local_pages(), respectively.

Maybe it is a leftover from memory hotplug. :(
It should be optimized further. I hope you will look into that.



* Re:  Why __alloc_contig_migrate_range calls  migrate_prep() at first?
@ 2016-06-01 12:11 Wang Sheng-Hui
  0 siblings, 0 replies; 8+ messages in thread
From: Wang Sheng-Hui @ 2016-06-01 12:11 UTC (permalink / raw)
  To: Minchan Kim; +Cc: akpm, mgorman, iamjoonsoo.kim, linux-mm, Vlastimil Babka


Minchan,

That sounds good to me. 

Thanks,
Wang Sheng-Hui


------------------ Original ------------------
From: "Minchan Kim" <minchan@kernel.org>
Date: Wed, Jun 1, 2016 03:40 PM
To: "Wang Sheng-Hui" <shhuiw@foxmail.com>
Cc: "akpm" <akpm@linux-foundation.org>; "mgorman" <mgorman@techsingularity.net>; "iamjoonsoo.kim" <iamjoonsoo.kim@lge.com>; "linux-mm" <linux-mm@kvack.org>; "Vlastimil Babka" <vbabka@suse.cz>
Subject: Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?



On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> Dear,
> 
> Sorry to trouble you.
> 
> I noticed cma_alloc would turn to  __alloc_contig_migrate_range for allocating pages.
> But  __alloc_contig_migrate_range calls  migrate_prep() at first, even if the requested page
> is single and free, lru_add_drain_all still run (called by  migrate_prep())?
> 
> Image a large chunk of free contig pages for CMA, various drivers may request a single page from
> the CMA area, we'll get  lru_add_drain_all run for each page.
> 
> Should we detect if the required pages are free before migrate_prep(), or detect at least for single 
> page allocation?

That makes sense to me.

How about calling migrate_prep once migrate_pages fails in the first trial?

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9d666df5ef95..c504c1a623d2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6623,8 +6623,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	unsigned int tries = 0;
 	int ret = 0;
 
-	migrate_prep();
-
 	while (pfn < end || !list_empty(&cc->migratepages)) {
 		if (fatal_signal_pending(current)) {
 			ret = -EINTR;
@@ -6650,6 +6648,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 		ret = migrate_pages(&cc->migratepages, alloc_migrate_target,
 				    NULL, 0, cc->mode, MR_CMA);
+		if (ret)
+			migrate_prep();
 	}
 	if (ret < 0) {
 		putback_movable_pages(&cc->migratepages);


> 
> ------------------
> Regards,
> Wang Sheng-Hui



end of thread

Thread overview: 8+ messages
2016-06-01  3:42 Why __alloc_contig_migrate_range calls migrate_prep() at first? Wang Sheng-Hui
2016-06-01  7:40 ` Minchan Kim
2016-06-02  1:19   ` Wang Sheng-Hui
2016-06-02  2:22     ` Joonsoo Kim
2016-06-02  4:29       ` Minchan Kim
2016-06-02  6:29         ` Joonsoo Kim
2016-06-02  6:46           ` Minchan Kim
2016-06-01 12:11 Wang Sheng-Hui
