From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752134AbdEJHmI (ORCPT ); Wed, 10 May 2017 03:42:08 -0400
Received: from mx2.suse.de ([195.135.220.15]:44921 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1752033AbdEJHmG (ORCPT ); Wed, 10 May 2017 03:42:06 -0400
Date: Wed, 10 May 2017 09:41:59 +0200
From: Michal Hocko
To: Naoya Horiguchi
Cc: Balbir Singh, Laurent Dufour,
	"linux-kernel@vger.kernel.org",
	"linux-mm@kvack.org",
	"akpm@linux-foundation.org"
Subject: Re: [PATCH v2 2/2] mm: skip HWPoisoned pages when onlining pages
Message-ID: <20170510074159.GD31466@dhcp22.suse.cz>
References: <1493130472-22843-1-git-send-email-ldufour@linux.vnet.ibm.com>
	<1493130472-22843-3-git-send-email-ldufour@linux.vnet.ibm.com>
	<1493172615.4828.3.camel@gmail.com>
	<20170426031255.GB11619@hori1.linux.bs1.fc.nec.co.jp>
	<20170428063048.GA9399@dhcp22.suse.cz>
	<20170428065050.GC8143@dhcp22.suse.cz>
	<20170428065131.GD8143@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170428065131.GD8143@dhcp22.suse.cz>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri 28-04-17 08:51:31, Michal Hocko wrote:
> On Fri 28-04-17 08:50:50, Michal Hocko wrote:
> > [Drop Wen Congyang because his address bounces - we will have to find
> > out ourselves...]
> > On Fri 28-04-17 08:30:48, Michal Hocko wrote:
> > > On Wed 26-04-17 03:13:04, Naoya Horiguchi wrote:
> > > > On Wed, Apr 26, 2017 at 12:10:15PM +1000, Balbir Singh wrote:
> > > > > On Tue, 2017-04-25 at 16:27 +0200, Laurent Dufour wrote:
> > > > > > The commit b023f46813cd ("memory-hotplug: skip HWPoisoned page when
> > > > > > offlining pages") skip the HWPoisoned pages when offlining pages, but
> > > > > > this should be skipped when onlining the pages too.
> > > > > >
> > > > > > Signed-off-by: Laurent Dufour
> > > > > > ---
> > > > > >  mm/memory_hotplug.c | 4 ++++
> > > > > >  1 file changed, 4 insertions(+)
> > > > > >
> > > > > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> > > > > > index 6fa7208bcd56..741ddb50e7d2 100644
> > > > > > --- a/mm/memory_hotplug.c
> > > > > > +++ b/mm/memory_hotplug.c
> > > > > > @@ -942,6 +942,10 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> > > > > >  	if (PageReserved(pfn_to_page(start_pfn)))
> > > > > >  		for (i = 0; i < nr_pages; i++) {
> > > > > >  			page = pfn_to_page(start_pfn + i);
> > > > > > +			if (PageHWPoison(page)) {
> > > > > > +				ClearPageReserved(page);
> > > > >
> > > > > Why do we clear page reserved? Also if the page is marked PageHWPoison, it
> > > > > was never offlined to begin with? Or do you expect this to be set on newly
> > > > > hotplugged memory? Also don't we need to skip the entire pageblock?
> > > >
> > > > If I read correctly, to "skip HWPoisoned page" in commit b023f46813cd means
> > > > that we skip the page status check for hwpoisoned pages *not* to prevent
> > > > memory offlining for memblocks with hwpoisoned pages. That means that
> > > > hwpoisoned pages can be offlined.
> > >
> > > Is this patch actually correct? I am trying to wrap my head around it
> > > but it smells like it tries to avoid the problem rather than fix it
> > > properly. I might be wrong here of course but to me it sounds like
> > > poisoned page should simply be offlined and keep its poison state all
> > > the time. If the memory is hot-removed and added again we have lost the
> > > struct page along with the state which is the expected behavior. If it
> > > is still broken we will re-poison it.
> > >
> > > Anyway a patch to skip over poisoned pages during online makes perfect
> > > sense to me. The PageReserved fiddling around much less so.
> > >
> > > Or am I missing something? Let's CC Wen Congyang for the clarification
> > > here.

Can we revisit this please? The PageReserved() logic for poisoned pages
is completely unclear to me. I would rather not rely on the previous
changelogs and instead build the picture from the expected behavior.
-- 
Michal Hocko
SUSE Labs