linux-mm.kvack.org archive mirror
From: "Liang, Kan" <kan.liang@intel.com>
To: Mel Gorman <mgorman@techsingularity.net>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, Andi Kleen <ak@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Jan Kara <jack@suse.cz>,
	linux-mm <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH 1/2] sched/wait: Break up long wake list walk
Date: Mon, 21 Aug 2017 18:56:20 +0000
Message-ID: <37D7C6CF3E00A74B8858931C1DB2F07753788B58@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <20170821183234.kzennaaw2zt2rbwz@techsingularity.net>

> > Because that code sequence doesn't actually depend on
> > "wait_on_page_lock()" for _correctness_ anyway, afaik. Anybody who
> > does "migration_entry_wait()" _has_ to retry anyway, since the page
> > table contents may have changed by waiting.
> >
> > So I'm not proud of the attached patch, and I don't think it's really
> > acceptable as-is, but maybe it's worth testing? And maybe it's
> > arguably no worse than what we have now?
> >
> > Comments?
> >
> 
> The transhuge migration path for NUMA balancing doesn't go through the
> migration_entry_wait path, despite similarly named functions that suggest
> it does, so this may have the most effect only when THP is disabled. It's
> worth trying anyway.

I just finished testing the yield patch (functionality only, not performance).
Yes, it works well with THP disabled.
With THP enabled, I observed one lockup caused by a long queue wait.

Here is the call stack with THP enabled. 
#
   100.00%  (ffffffff9e1aefca)
            |
            ---wait_on_page_bit
               do_huge_pmd_numa_page
               __handle_mm_fault
               handle_mm_fault
               __do_page_fault
               do_page_fault
               page_fault
               |
               |--60.39%--0x2b7b7
               |          |
               |          |--34.26%--0x127d8
               |          |          start_thread
               |          |
               |           --25.95%--0x127a2
               |                     start_thread
               |
                --39.25%--0x2b788
                          |
                           --38.81%--0x127a2
                                     start_thread


> 
> Covering both paths would be something like the patch below which spins
> until the page is unlocked or it should reschedule. It's not even boot
> tested as I spent what time I had on the test case that I hoped would be
> able to prove it really works.

I will give it a try.

Thanks,
Kan

> 
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 79b36f57c3ba..31cda1288176 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -517,6 +517,13 @@ static inline void wait_on_page_locked(struct page *page)
>  		wait_on_page_bit(compound_head(page), PG_locked);
>  }
> 
> +void __spinwait_on_page_locked(struct page *page);
> +static inline void spinwait_on_page_locked(struct page *page)
> +{
> +	if (PageLocked(page))
> +		__spinwait_on_page_locked(page);
> +}
> +
>  static inline int wait_on_page_locked_killable(struct page *page)
>  {
>  	if (!PageLocked(page))
> diff --git a/mm/filemap.c b/mm/filemap.c
> index a49702445ce0..c9d6f49614bc 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -1210,6 +1210,15 @@ int __lock_page_or_retry(struct page *page, struct mm_struct *mm,
>  	}
>  }
> 
> +void __spinwait_on_page_locked(struct page *page)
> +{
> +	do {
> +		cpu_relax();
> +	} while (PageLocked(page) && !cond_resched());
> +
> +	wait_on_page_locked(page);
> +}
> +
>  /**
>   * page_cache_next_hole - find the next hole (not-present entry)
>   * @mapping: mapping
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 90731e3b7e58..c7025c806420 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1443,7 +1443,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
>  		if (!get_page_unless_zero(page))
>  			goto out_unlock;
>  		spin_unlock(vmf->ptl);
> -		wait_on_page_locked(page);
> +		spinwait_on_page_locked(page);
>  		put_page(page);
>  		goto out;
>  	}
> @@ -1480,7 +1480,7 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
>  		if (!get_page_unless_zero(page))
>  			goto out_unlock;
>  		spin_unlock(vmf->ptl);
> -		wait_on_page_locked(page);
> +		spinwait_on_page_locked(page);
>  		put_page(page);
>  		goto out;
>  	}
> diff --git a/mm/migrate.c b/mm/migrate.c
> index e84eeb4e4356..9b6c3fc5beac 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -308,7 +308,7 @@ void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
>  	if (!get_page_unless_zero(page))
>  		goto out;
>  	pte_unmap_unlock(ptep, ptl);
> -	wait_on_page_locked(page);
> +	spinwait_on_page_locked(page);
>  	put_page(page);
>  	return;
>  out:
> 
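
For anyone who wants to experiment with the spin-then-block idea from the
quoted patch outside the kernel, here is a minimal user-space sketch. It is
not the kernel code and not part of the patch. Everything named fake_* is
hypothetical, as are unlocker() and demo_spinwait(). One deliberate
substitution: the patch ends its spin phase when cond_resched() reports a
reschedule, which has no direct user-space equivalent, so this sketch
assumes a bounded spin budget with sched_yield() instead. After the budget
runs out it falls back to a blocking wait, as the patch falls back to
wait_on_page_locked().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical user-space stand-in for a page with a PG_locked bit. */
struct fake_page {
	atomic_bool locked;
	pthread_mutex_t mu;
	pthread_cond_t unlocked;
};

static void fake_page_init(struct fake_page *p, bool locked)
{
	atomic_init(&p->locked, locked);
	pthread_mutex_init(&p->mu, NULL);
	pthread_cond_init(&p->unlocked, NULL);
}

/* Clear the lock bit and wake every blocked waiter. */
static void fake_page_unlock(struct fake_page *p)
{
	pthread_mutex_lock(&p->mu);
	atomic_store(&p->locked, false);
	pthread_cond_broadcast(&p->unlocked);
	pthread_mutex_unlock(&p->mu);
}

/* Blocking wait, playing the role of wait_on_page_locked(). */
static void fake_wait_on_page_locked(struct fake_page *p)
{
	pthread_mutex_lock(&p->mu);
	while (atomic_load(&p->locked))
		pthread_cond_wait(&p->unlocked, &p->mu);
	pthread_mutex_unlock(&p->mu);
}

/*
 * Spin for at most spin_budget iterations (assumption: a fixed budget
 * replaces the cond_resched() check), then block.
 * Returns true if the spin phase alone saw the page unlocked.
 */
static bool fake_spinwait_on_page_locked(struct fake_page *p,
					 unsigned long spin_budget)
{
	for (unsigned long i = 0; i < spin_budget; i++) {
		if (!atomic_load(&p->locked))
			return true;
		sched_yield();
	}
	fake_wait_on_page_locked(p);
	return false;
}

/* Helper thread: hold the "page lock" for about 1 ms, then release it. */
static void *unlocker(void *arg)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };

	nanosleep(&ts, NULL);
	fake_page_unlock(arg);
	return NULL;
}

/* Returns 0 if the waiter came back with the page unlocked. */
static int demo_spinwait(unsigned long spin_budget)
{
	struct fake_page page;
	pthread_t t;
	int rc;

	fake_page_init(&page, true);
	rc = pthread_create(&t, NULL, unlocker, &page);
	assert(rc == 0);
	fake_spinwait_on_page_locked(&page, spin_budget);
	pthread_join(t, NULL);
	rc = atomic_load(&page.locked) ? 1 : 0;
	pthread_cond_destroy(&page.unlocked);
	pthread_mutex_destroy(&page.mu);
	return rc;
}
```

Calling demo_spinwait(0) skips the spin phase and goes straight to the
blocking wait, while a large budget lets the waiter return without ever
sleeping on the condition variable. Waiters that finish in the spin phase
never join the wait queue, which is the property the patch relies on to
keep the wake list short.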


