From: Minchan Kim <minchan@kernel.org>
To: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Wang Sheng-Hui <shhuiw@foxmail.com>,
	akpm <akpm@linux-foundation.org>,
	mgorman <mgorman@techsingularity.net>,
	linux-mm <linux-mm@kvack.org>, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: Why __alloc_contig_migrate_range calls migrate_prep() at first?
Date: Thu, 2 Jun 2016 13:29:16 +0900
Message-ID: <20160602042916.GB3024@bbox>
In-Reply-To: <20160602022242.GB9133@js1304-P5Q-DELUXE>

On Thu, Jun 02, 2016 at 11:22:43AM +0900, Joonsoo Kim wrote:
> On Thu, Jun 02, 2016 at 09:19:19AM +0800, Wang Sheng-Hui wrote:
> > 
> > 
> > On 6/1/2016 3:40 PM, Minchan Kim wrote:
> > > On Wed, Jun 01, 2016 at 11:42:29AM +0800, Wang Sheng-Hui wrote:
> > >> Dear,
> > >>
> > >> Sorry to trouble you.
> > >>
> > >> I noticed that cma_alloc ends up in __alloc_contig_migrate_range when allocating pages.
> > >> But __alloc_contig_migrate_range calls migrate_prep() first, so even if the requested page
> > >> is a single free page, lru_add_drain_all still runs (called by migrate_prep())?
> > >>
> > >> Imagine a large chunk of free contiguous pages for CMA: various drivers may each request a single
> > >> page from the CMA area, and lru_add_drain_all will run for each page.
> > >>
> > >> Should we check whether the requested pages are free before migrate_prep(), or at least do so
> > >> for single-page allocations?
> > > That makes sense to me.
> > >
> > > How about calling migrate_prep only once migrate_pages fails on the first attempt?
> > 
> > Minchan,
> > 
> > I tried your patch in my environment, and the number of migrate_prep() calls dropped a lot.
> > 
> > In my case, CMA reserves 512MB, and Linux calls migrate_prep() roughly 40 times during boot;
> > most are single-page allocation requests to CMA.
> > With your patch, migrate_prep() is not called for the single-page allocation requests because
> > there are enough free pages in the CMA area.
> > 
> > Will you please push the patch upstream?
> 
> It is not correct.
> 
> migrate_prep() is called to move LRU pages in the lruvec to the LRU. In
> isolate_migratepages_range(), non-LRU pages are just skipped, so if a
> page is still on the lruvec it will not be isolated and no error is returned.
> So, "if (ret) migrate_prep()" will not be called and we can't catch
> the page in the lruvec.

Ah, true. Thanks for the correction.

A simple fix is to remove migrate_prep() from there and retry once if
test_pages_isolated() finds that migration failed for at least one page.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7da8310b86e9..e0aa4a9b573d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7294,8 +7294,6 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 	unsigned int tries = 0;
 	int ret = 0;
 
-	migrate_prep();
-
 	while (pfn < end || !list_empty(&cc->migratepages)) {
 		if (fatal_signal_pending(current)) {
 			ret = -EINTR;
@@ -7355,6 +7353,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	unsigned long outer_start, outer_end;
 	unsigned int order;
 	int ret = 0;
+	bool lru_flushed = false;
 
 	struct compact_control cc = {
 		.nr_migratepages = 0,
@@ -7395,6 +7394,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	if (ret)
 		return ret;
 
+again:
 	/*
 	 * In case of -EBUSY, we'd like to know which page causes problem.
 	 * So, just fall through. We will check it in test_pages_isolated().
@@ -7448,6 +7448,11 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 
 	/* Make sure the range is really isolated. */
 	if (test_pages_isolated(outer_start, end, false)) {
+		if (!lru_flushed) {
+			lru_flushed = true;
+			goto again;
+		}
+
 		pr_info("%s: [%lx, %lx) PFNs busy\n",
 			__func__, outer_start, end);
 		ret = -EBUSY;
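
To make the difference between the two retry conditions concrete, here is a
minimal user-space sketch in C. It is a toy model, not kernel code: the page
states, struct range, and the model_* helpers are all invented for
illustration, and the drain is written explicitly in the retry branch so the
model is self-contained (where the real code drains before the second pass
lies outside the hunks shown above). It shows that a flush keyed on a
migration error never fires when batched pages are silently skipped, while a
retry keyed on the final isolation check does.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_PAGES 8

/* Hypothetical page states; the names are invented for this model. */
enum page_state { PAGE_FREE, PAGE_ON_LRU, PAGE_BATCHED };

struct range {
	enum page_state pages[NR_PAGES];
	int nr_drains;	/* how many times the expensive flush ran */
};

/* Stand-in for lru_add_drain_all(): push batched pages onto the LRU. */
static void model_drain(struct range *r)
{
	r->nr_drains++;
	for (int i = 0; i < NR_PAGES; i++)
		if (r->pages[i] == PAGE_BATCHED)
			r->pages[i] = PAGE_ON_LRU;
}

/*
 * Stand-in for the migration pass: LRU pages are migrated away,
 * batched pages are skipped, and skipping is not reported as an error.
 */
static int model_migrate(struct range *r)
{
	for (int i = 0; i < NR_PAGES; i++)
		if (r->pages[i] == PAGE_ON_LRU)
			r->pages[i] = PAGE_FREE;
	return 0;
}

/* Stand-in for test_pages_isolated(): true if any page is still in use. */
static bool model_busy(const struct range *r)
{
	for (int i = 0; i < NR_PAGES; i++)
		if (r->pages[i] != PAGE_FREE)
			return true;
	return false;
}

/* Flush only when migration reports an error: never triggers in this model. */
static int alloc_flush_on_error(struct range *r)
{
	if (model_migrate(r))
		model_drain(r);
	return model_busy(r) ? -EBUSY : 0;
}

/* Flush once when the final check fails, then retry the whole pass. */
static int alloc_flush_on_busy(struct range *r)
{
	bool lru_flushed = false;

again:
	model_migrate(r);
	if (model_busy(r)) {
		if (!lru_flushed) {
			model_drain(r);
			lru_flushed = true;
			goto again;
		}
		return -EBUSY;
	}
	return 0;
}
```

In this model a range with one batched page makes alloc_flush_on_error()
return -EBUSY without ever draining, while alloc_flush_on_busy() succeeds
after exactly one drain, and an entirely free range succeeds with no drain at
all.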


> 
> Anyway, a better optimization for your case should be done at a higher
> level. See the following patch. It removes useless pageblock isolation and migration
> where possible. In fact, we could do even better than below by introducing
> an alloc_contig_range() light mode that skips migrate_prep() and other high-cost things,
> but that needs more surgery. I will revisit it soon.

Yes, there is a lot of room for improvement in cma_alloc. I remember that
a few years ago some people (graphics, maybe) complained that cma_alloc is
really slow for small-order pages, so it would be well worth doing.

I'm looking forward to seeing that.

> 
> Thanks.
> 
> ----------->8---------------
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1a7f110..4af3665 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7314,6 +7314,51 @@ static unsigned long pfn_max_align_up(unsigned long pfn)
>                                 pageblock_nr_pages));
>  }
>  
> +static int alloc_contig_range_fast(unsigned long start, unsigned long end, struct compact_control *cc)
> +{
> +       unsigned int order;
> +       unsigned long outer_start, outer_end;
> +       int ret = 0;
> +
> +       order = 0;
> +       outer_start = start;
> +       while (!PageBuddy(pfn_to_page(outer_start))) {
> +               if (++order >= MAX_ORDER) {
> +                       outer_start = start;
> +                       break;
> +               }
> +               outer_start &= ~0UL << order;
> +       }
> +
> +       if (outer_start != start) {
> +               order = page_order(pfn_to_page(outer_start));
> +
> +               /*
> +                * outer_start page could be small order buddy page and
> +                * it doesn't include start page. Adjust outer_start
> +                * in this case to report failed page properly
> +                * on tracepoint in test_pages_isolated()
> +                */
> +               if (outer_start + (1UL << order) <= start)
> +                       outer_start = start;
> +       }
> +
> +       /* Grab isolated pages from freelists. */
> +       outer_end = isolate_freepages_range(cc, outer_start, end);
> +       if (!outer_end) {
> +               ret = -EBUSY;
> +               goto done;
> +       }
> +
> +       if (start != outer_start)
> +               free_contig_range(outer_start, start - outer_start);
> +       if (end != outer_end)
> +               free_contig_range(end, outer_end - end);
> +
> +done:
> +       return ret;
> +}
> +
>  /* [start, end) must belong to a single zone. */
>  static int __alloc_contig_migrate_range(struct compact_control *cc,
>                                         unsigned long start, unsigned long end)
> @@ -7390,6 +7435,9 @@ int alloc_contig_range(unsigned long start, unsigned long end)
>         };
>         INIT_LIST_HEAD(&cc.migratepages);
>  
> +       if (!alloc_contig_range_fast(start, end, &cc))
> +               return 0;
> +
>         /*
>          * What we do here is we mark all pageblocks in range as
>          * MIGRATE_ISOLATE.  Because pageblock and max order pages may
> 
> 
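
The outer_start walk in the quoted alloc_contig_range_fast() is easy to
misread, so here is a small user-space model of just that arithmetic. It is a
sketch under assumptions: struct free_block, find_head() and MODEL_MAX_ORDER
(set to 11 here) are hypothetical stand-ins for PageBuddy()/page_order() and
MAX_ORDER, not kernel APIs.

```c
#include <stddef.h>

#define MODEL_MAX_ORDER 11	/* assumption for this model only */

/* Hypothetical description of one free buddy block: head PFN and order. */
struct free_block {
	unsigned long pfn;
	unsigned int order;
};

/* Stand-in for PageBuddy(): return the block whose head is at pfn, if any. */
static const struct free_block *find_head(const struct free_block *blocks,
					  size_t n, unsigned long pfn)
{
	for (size_t i = 0; i < n; i++)
		if (blocks[i].pfn == pfn)
			return &blocks[i];
	return NULL;
}

/*
 * Walk down from start, aligning to ever larger orders, until a free
 * block head is found. Fall back to start if none is found or if the
 * block that was found is too small to cover start.
 */
static unsigned long find_outer_start(const struct free_block *blocks,
				      size_t n, unsigned long start)
{
	unsigned int order = 0;
	unsigned long outer_start = start;
	const struct free_block *head;

	while (!(head = find_head(blocks, n, outer_start))) {
		if (++order >= MODEL_MAX_ORDER)
			return start;
		/* Clear the low 'order' bits: align down to 2^order pages. */
		outer_start &= ~0UL << order;
	}

	/* A small block at outer_start may end before start is reached. */
	if (outer_start != start &&
	    outer_start + (1UL << head->order) <= start)
		outer_start = start;

	return outer_start;
}
```

With a single order-4 free block at PFN 0x1000, find_outer_start() for start
0x1005 returns 0x1000. If the block at 0x1000 is only order 1, it does not
cover 0x1005 and the function returns 0x1005 instead.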


