From: Michal Hocko <mhocko@kernel.org>
To: Baoquan He <bhe@redhat.com>
Cc: David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, pifang@redhat.com,
	linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
	aarcange@redhat.com
Subject: Re: Memory hotplug softlock issue
Date: Fri, 16 Nov 2018 10:14:09 +0100
Message-ID: <20181116091409.GD14706@dhcp22.suse.cz>
In-Reply-To: <20181116012433.GU2653@MiWiFi-R3L-srv>

On Fri 16-11-18 09:24:33, Baoquan He wrote:
> On 11/15/18 at 03:32pm, Michal Hocko wrote:
> > On Thu 15-11-18 21:38:40, Baoquan He wrote:
> > > On 11/15/18 at 02:19pm, Michal Hocko wrote:
> > > > On Thu 15-11-18 21:12:11, Baoquan He wrote:
> > > > > On 11/15/18 at 09:30am, Michal Hocko wrote:
> > > > [...]
> > > > > > It would be also good to find out whether this is fs specific. E.g. does
> > > > > > it make any difference if you use a different one for your stress
> > > > > > testing?
> > > > > 
> > > > > > Created a ramdisk and put the stress binary there, then ran stress -m 200.
> > > > > > Now it seems to be stuck migrating libc-2.28.so. And it's still xfs, so xfs
> > > > > > is now a big suspect. At the bottom I paste the numactl output; you can see
> > > > > > that it's the last 4G.
> > > > > > 
> > > > > > It seems to be trying to migrate libc-2.28.so, but the stress program keeps
> > > > > > accessing and activating it.
> > > > 
> > > > > Is this still with faultaround disabled? I have seen exactly the same
> > > > > pattern in the bug I am working on. It was ext4 though.
> > > 
> > > > After struggling for a long time, the second-to-last block, where
> > > > libc-2.28.so is located, was reclaimed. Now it has come to the last memory
> > > > block, which is again the stress program itself. The swap migration entry
> > > > has been made and it is trying to unmap; now it's looping there.
> > > 
> > > [  +0.004445] migrating pfn 190ff2bb0 failed 
> > > [  +0.000013] page:ffffea643fcaec00 count:203 mapcount:201 mapping:ffff888dfb268f48 index:0x0
> > > [  +0.012809] shmem_aops 
> > > [  +0.000011] name:"stress" 
> > > [  +0.002550] flags: 0x1dfffffc008004e(referenced|uptodate|dirty|workingset|swapbacked)
> > > [  +0.010715] raw: 01dfffffc008004e ffffea643fcaec48 ffffea643fc714c8 ffff888dfb268f48
> > > [  +0.007828] raw: 0000000000000000 0000000000000000 000000cb000000c8 ffff888e72e92000
> > > [  +0.007810] page->mem_cgroup:ffff888e72e92000
> > [...]
> > > [  +0.004455] migrating pfn 190ff2bb0 failed 
> > > [  +0.000018] page:ffffea643fcaec00 count:203 mapcount:201 mapping:ffff888dfb268f48 index:0x0
> > > [  +0.014392] shmem_aops 
> > > [  +0.000010] name:"stress" 
> > > [  +0.002565] flags: 0x1dfffffc008004e(referenced|uptodate|dirty|workingset|swapbacked)
> > > [  +0.010675] raw: 01dfffffc008004e ffffea643fcaec48 ffffea643fc714c8 ffff888dfb268f48
> > > [  +0.007819] raw: 0000000000000000 0000000000000000 000000cb000000c8 ffff888e72e92000
> > > [  +0.007808] page->mem_cgroup:ffff888e72e92000
> > 
> > > OK, so this is tmpfs-backed code of your stress test. This just tells us
> > that this is not fs specific. Reference count is 2 more than the map
> > count which is the expected state. So the reference count must have been
> > elevated at the time when the migration was attempted. Shmem supports
> > fault around so this might be still possible (assuming it is enabled).
> > If not we really need to dig deeper. I will think of a debugging patch.
> 
> > Disabled faultaround and rebooted, then tested again. It's looping forever
> > in the last block again, on node2, and it's the stress program itself again.
> > The weird thing is that the refcount seems to have gone crazy; it's a random
> > number now. Something must be going wrong.
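
As a cross-check on the earlier dumps quoted above (count:203 mapcount:201),
the two counters can be read straight out of the second "raw:" line, whose
third word is 000000cb000000c8. The sketch below assumes the layout that the
dump itself suggests: a little-endian 64-bit machine with _mapcount in the low
32 bits and _refcount in the high 32 bits of that word. The helper names are
made up for illustration and are not kernel functions.

```c
#include <stdint.h>

/* Counters as reported by the page dump ("count:" and "mapcount:"). */
struct page_counts {
	int refcount;
	int mapcount;
};

/*
 * Decode the raw word that holds both counters.
 * Assumption: low 32 bits = _mapcount, high 32 bits = _refcount.
 */
struct page_counts decode_counts(uint64_t raw_word)
{
	struct page_counts c;

	c.refcount = (int32_t)(uint32_t)(raw_word >> 32);
	/* _mapcount starts at -1, so the number of mappings is _mapcount + 1. */
	c.mapcount = (int32_t)(uint32_t)(raw_word & 0xffffffffu) + 1;
	return c;
}

/* References that are not accounted for by page table mappings. */
int extra_refs(struct page_counts c)
{
	return c.refcount - c.mapcount;
}
```

For 0x000000cb000000c8 this gives a reference count of 0xcb = 203 and a map
count of 0xc8 + 1 = 201, so two references beyond the mappings, which matches
the dump.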

Could you try to apply this debugging patch on top please? It will dump a
stack trace for each reference count elevation on the first page that still
fails to migrate after multiple passes.
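
The idea can be sketched in plain userspace C first. This is only an
illustration under assumptions, not kernel code: struct obj, obj_ref_inc(),
try_move() and move_with_tracking() are hypothetical stand-ins for struct page,
page_ref_inc(), the per-page migration attempt and migrate_pages(), and a
counter plus fprintf() stands in for dump_stack().

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page: just a reference count. */
struct obj {
	int refcount;
};

/* Object whose reference count elevations get reported (NULL = none). */
struct obj *obj_to_track;

/* How many times the tracked object had a reference taken. */
int tracked_hits;

void obj_ref_inc(struct obj *o)
{
	o->refcount++;
	if (o == obj_to_track) {
		/* The kernel patch calls dump_stack() at this point. */
		tracked_hits++;
		fprintf(stderr, "ref taken on tracked object %p, count now %d\n",
			(void *)o, o->refcount);
	}
}

/* Hypothetical move attempt: succeeds only without unexpected references. */
int try_move(struct obj *o, int expected)
{
	return o->refcount == expected ? 0 : -1;
}

/*
 * Retry loop: once an object has still failed after a couple of passes,
 * remember it so that every later obj_ref_inc() on it is reported.
 */
int move_with_tracking(struct obj *o, int expected, int max_passes)
{
	int pass;

	obj_to_track = NULL;
	for (pass = 0; pass < max_passes; pass++) {
		if (try_move(o, expected) == 0)
			return 0;
		if (pass > 1 && !obj_to_track)
			obj_to_track = o;
	}
	return -1;
}
```

Only one object is tracked at a time and tracking is reset at the start of
each call, which keeps the output limited to a page that is actually stuck.
The real patch follows.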

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 14d14beb1f7f..b64ebf253381 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -72,9 +72,12 @@ static inline int page_count(struct page *page)
 	return atomic_read(&compound_head(page)->_refcount);
 }
 
+extern struct page *page_to_track;
 static inline void set_page_count(struct page *page, int v)
 {
 	atomic_set(&page->_refcount, v);
+	if (page == page_to_track)
+		dump_stack();
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_set))
 		__page_ref_set(page, v);
 }
@@ -91,6 +94,8 @@ static inline void init_page_count(struct page *page)
 static inline void page_ref_add(struct page *page, int nr)
 {
 	atomic_add(nr, &page->_refcount);
+	if (page == page_to_track)
+		dump_stack();
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
 		__page_ref_mod(page, nr);
 }
@@ -105,6 +110,8 @@ static inline void page_ref_sub(struct page *page, int nr)
 static inline void page_ref_inc(struct page *page)
 {
 	atomic_inc(&page->_refcount);
+	if (page == page_to_track)
+		dump_stack();
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
 		__page_ref_mod(page, 1);
 }
@@ -129,6 +136,8 @@ static inline int page_ref_inc_return(struct page *page)
 {
 	int ret = atomic_inc_return(&page->_refcount);
 
+	if (page == page_to_track)
+		dump_stack();
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_and_return))
 		__page_ref_mod_and_return(page, 1, ret);
 	return ret;
@@ -156,6 +165,8 @@ static inline int page_ref_add_unless(struct page *page, int nr, int u)
 {
 	int ret = atomic_add_unless(&page->_refcount, nr, u);
 
+	if (page == page_to_track)
+		dump_stack();
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod_unless))
 		__page_ref_mod_unless(page, nr, ret);
 	return ret;
diff --git a/mm/migrate.c b/mm/migrate.c
index f7e4bfdc13b7..9b2e395a3d68 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1338,6 +1338,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	return rc;
 }
 
+struct page *page_to_track;
+
 /*
  * migrate_pages - migrate the pages specified in a list, to the free pages
  *		   supplied as the target for the page migration
@@ -1375,6 +1377,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	if (!swapwrite)
 		current->flags |= PF_SWAPWRITE;
 
+	page_to_track = NULL;
 	for(pass = 0; pass < 10 && retry; pass++) {
 		retry = 0;
 
@@ -1417,6 +1420,8 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 				goto out;
 			case -EAGAIN:
 				retry++;
+				if (pass > 1 && !page_to_track)
+					page_to_track = page;
 				break;
 			case MIGRATEPAGE_SUCCESS:
 				nr_succeeded++;
-- 
Michal Hocko
SUSE Labs
