From: Michal Hocko <mhocko@kernel.org>
To: Baoquan He <bhe@redhat.com>
Cc: Hugh Dickins <hughd@google.com>, Vlastimil Babka <vbabka@suse.cz>,
	pifang@redhat.com, David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, aarcange@redhat.com,
	Mel Gorman <mgorman@suse.de>
Subject: Re: Memory hotplug softlock issue
Date: Tue, 20 Nov 2018 15:05:24 +0100
Message-ID: <20181120140524.GI22247@dhcp22.suse.cz>
In-Reply-To: <20181120135803.GA3369@MiWiFi-R3L-srv>

On Tue 20-11-18 21:58:03, Baoquan He wrote:
> Hi,
> 
> On 11/20/18 at 02:38pm, Vlastimil Babka wrote:
> > On 11/20/18 6:44 AM, Hugh Dickins wrote:
> > > [PATCH] mm: put_and_wait_on_page_locked() while page is migrated
> > > 
> > > We have all assumed that it is essential to hold a page reference while
> > > waiting on a page lock: partly to guarantee that there is still a struct
> > > page when MEMORY_HOTREMOVE is configured, but also to protect against
> > > reuse of the struct page going to someone who then holds the page locked
> > > indefinitely, when the waiter can reasonably expect timely unlocking.
> > > 
> > > But in fact, so long as wait_on_page_bit_common() does the put_page(),
> > > and is careful not to rely on struct page contents thereafter, there is
> > > no need to hold a reference to the page while waiting on it.  That does
> > 
> > So there's still a moment where refcount is elevated, but hopefully
> > short enough, right? Let's see if it survives Baoquan's stress testing.
> 
> Yes, I applied Hugh's patch 8 hours ago, and then Ping from our QE team
> tested on that machine. After many rounds of hot removing/adding, the
> endless looping during migration is no longer seen. The test result for
> Hugh's patch is positive. I even suggested that Ping increase the memory
> pressure to "stress -m 250", and offline and remove still succeeded.
> 
> So I think this patch works to solve the issue. Thanks a lot for your
> help, all of you. 

This is great news! Thanks for your swift feedback. I will go and try
to review Hugh's patch soon.

> Hugh, will you post a formal patch in a separate thread?
> 
> Meanwhile, we found that onlining pages sometimes does not add back all
> memory blocks on one memory board, and hot removing/adding them then
> causes a kernel panic. I will investigate further and collect information
> to see whether it's a kernel issue or a udev issue.

It would be great to get a report in a new email thread.
> 
> Thanks
> Baoquan
> 
> > 
> > > mean that this case cannot go back through the loop: but that's fine for
> > > the page migration case, and even if used more widely, is limited by the
> > > "Stop walking if it's locked" optimization in wake_page_function().
> > > 
> > > Add interface put_and_wait_on_page_locked() to do this, using negative
> > > value of the lock arg to wait_on_page_bit_common() to implement it.
> > > No interruptible or killable variant needed yet, but they might follow:
> > > I have a vague notion that reporting -EINTR should take precedence over
> > > return from wait_on_page_bit_common() without knowing the page state,
> > > so arrange it accordingly - but that may be nothing but pedantic.
> > > 
> > > shrink_page_list()'s __ClearPageLocked(): that was a surprise! this
> > > survived a lot of testing before that showed up.  It does raise the
> > > question: should is_page_cache_freeable() and __remove_mapping() now
> > > treat a PG_waiters page as if an extra reference were held?  Perhaps,
> > > but I don't think it matters much, since shrink_page_list() already
> > > had to win its trylock_page(), so waiters are not very common there: I
> > > noticed no difference when trying the bigger change, and it's surely not
> > > needed while put_and_wait_on_page_locked() is only for page migration.
> > > 
> > > Signed-off-by: Hugh Dickins <hughd@google.com>
> > > ---
> > 
> > ...
> > 
> > > @@ -1100,6 +1111,17 @@ static inline int wait_on_page_bit_common(wait_queue_head_t *q,
> > >  			ret = -EINTR;
> > >  			break;
> > >  		}
> > > +
> > > +		if (lock < 0) {
> > > +			/*
> > > +			 * We can no longer safely access page->flags:
> > 
> > Hmm...
> > 
> > > +			 * even if CONFIG_MEMORY_HOTREMOVE is not enabled,
> > > +			 * there is a risk of waiting forever on a page reused
> > > +			 * for something that keeps it locked indefinitely.
> > > +			 * But best check for -EINTR above before breaking.
> > > +			 */
> > > +			break;
> > > +		}
> > >  	}
> > >  
> > >  	finish_wait(q, wait);
> > 
> > ... the code continues by:
> > 
> >         if (thrashing) {
> >                 if (!PageSwapBacked(page))
> > 
> > So maybe we should not set 'thrashing' true when lock < 0?
> > 
> > Thanks!
> > Vlastimil
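
To illustrate the concern about touching the page after the reference
has been dropped, here is a minimal userspace sketch, not kernel code.
It assumes one possible way to handle it: decide everything that depends
on page state before the put, and use only cached values afterwards.
struct fake_page, struct wait_result and wait_sketch() are hypothetical
names made up for this illustration, and the sleep on the wait queue is
elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel structure. */
struct fake_page {
	int refcount;
	bool uptodate;
	bool workingset;
	bool swap_backed;
};

/* Hypothetical record of the decisions taken while the page was pinned. */
struct wait_result {
	bool thrashing;	/* page looked like a thrashing refault */
	bool delayacct;	/* delay accounting was started for it */
};

struct wait_result wait_sketch(struct fake_page *page, int lock)
{
	struct wait_result res = { false, false };

	/* The caller must still hold a reference on entry. */
	assert(page != NULL && page->refcount > 0);

	/* Read all page state now, while the reference is still held. */
	if (!page->uptodate && page->workingset) {
		res.thrashing = true;
		res.delayacct = !page->swap_backed;
	}

	if (lock < 0) {
		/* Models put_page(): the page may be reused from here on. */
		page->refcount--;
		page = NULL;	/* any later dereference is an obvious bug */
	}

	/* ... the sleep on the wait queue would happen here ... */

	/*
	 * Post-wait accounting would consult only res.thrashing and
	 * res.delayacct here, never the page itself.
	 */
	return res;
}
```

Calling wait_sketch(&p, -1) on a page with refcount 1 leaves the
refcount at 0 and returns the decisions taken before the put, so the
accounting after the wait never needs to look at the page again.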

-- 
Michal Hocko
SUSE Labs
