Date: Thu, 17 Oct 2019 20:27:59 +0200
From: Michal Hocko
To: Naoya Horiguchi
Cc: Qian Cai, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", David Hildenbrand, Mike Kravetz
Subject: Re: memory offline infinite loop after soft offline
Message-ID: <20191017182759.GN24485@dhcp22.suse.cz>
References: <1570829564.5937.36.camel@lca.pw>
 <20191014083914.GA317@dhcp22.suse.cz>
 <20191017093410.GA19973@hori.linux.bs1.fc.nec.co.jp>
 <20191017100106.GF24485@dhcp22.suse.cz>
 <1571335633.5937.69.camel@lca.pw>
In-Reply-To: <1571335633.5937.69.camel@lca.pw>

On Thu 17-10-19 14:07:13, Qian Cai wrote:
> On Thu, 2019-10-17 at 12:01 +0200, Michal Hocko wrote:
> > On Thu 17-10-19 09:34:10, Naoya Horiguchi wrote:
> > > On Mon, Oct 14, 2019 at 10:39:14AM +0200, Michal Hocko wrote:
> >
> > [...]
> > > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > > > index 89c19c0feadb..5fb3fee16fde 100644
> > > > --- a/mm/page_isolation.c
> > > > +++ b/mm/page_isolation.c
> > > > @@ -274,7 +274,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
> > > >  			 * simple way to verify that as VM_BUG_ON(), though.
> > > >  			 */
> > > >  			pfn += 1 << page_order(page);
> > > > -		else if (skip_hwpoisoned_pages && PageHWPoison(page))
> > > > +		else if (skip_hwpoisoned_pages && PageHWPoison(compound_head(page)))
> > > >  			/* A HWPoisoned page cannot be also PageBuddy */
> > > >  			pfn++;
> > > >  		else
> > >
> > > This fix looks good to me. The original code only addresses hwpoisoned 4kB-page,
> > > we seem to have this issue since the following commit,
> >
> > Thanks a lot for double checking Naoya!
> >
> > > commit b023f46813cde6e3b8a8c24f432ff9c1fd8e9a64
> > > Author: Wen Congyang
> > > Date:   Tue Dec 11 16:00:45 2012 -0800
> > >
> > >     memory-hotplug: skip HWPoisoned page when offlining pages
> > >
> > > and extension of LTP coverage finally discovered this.
> >
> > Qian, could you give the patch some testing?
>
> Unfortunately, this does not solve the problem. It looks to me that in
> soft_offline_huge_page(), set_hwpoison_free_buddy_page() will only set
> PG_hwpoison for buddy pages, so the even the compound_head() has no PG_hwpoison
> set.
>
> 		if (PageBuddy(page_head) && page_order(page_head) >= order) {
> 			if (!TestSetPageHWPoison(page))
> 				hwpoisoned = true;

This is more than unexpected. How are we supposed to find out that the
page is poisoned? Any idea Naoya?

-- 
Michal Hocko
SUSE Labs
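
For illustration only, the branch Qian quotes can be modelled in
userspace. This is a minimal sketch, not kernel code: struct toy_page,
TOY_MAX_ORDER, TOY_NPAGES and all toy_* helpers are hypothetical names
made up for the example, and the model assumes nothing beyond what the
quoted lines show, namely that the bit is set on `page` while the buddy
test is done on `page_head`. Whether that is what defeats the
PageHWPoison(compound_head(page)) check is exactly the open question
above; the last helper is there only for contrast and is not a proposed
fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER 4	/* hypothetical: orders 0..3, blocks of up to 8 pages */
#define TOY_NPAGES    8	/* hypothetical size of the toy page array */

/* Toy stand-in for struct page; this is not the kernel's layout. */
struct toy_page {
	bool hwpoison;		/* models PG_hwpoison */
	bool buddy;		/* models PageBuddy(): first page of a free block */
	unsigned int order;	/* models page_order(); valid only if buddy */
};

/*
 * Models the quoted branch: find the free block containing pages[idx]
 * and set the poison bit on pages[idx] itself, not on the block's head.
 * Returns true if the bit was newly set.
 */
static bool toy_set_hwpoison_free_buddy(struct toy_page *pages, size_t idx)
{
	unsigned int order;

	assert(idx < TOY_NPAGES);
	for (order = 0; order < TOY_MAX_ORDER; order++) {
		/* First page of the order-aligned block that holds idx. */
		size_t head = idx & ~(((size_t)1 << order) - 1);

		if (pages[head].buddy && pages[head].order >= order) {
			bool was_set = pages[idx].hwpoison;

			pages[idx].hwpoison = true;
			return !was_set;
		}
	}
	return false;
}

/* Models a check that looks only at the first page of a block. */
static bool toy_head_is_poisoned(const struct toy_page *pages, size_t head)
{
	return pages[head].hwpoison;
}

/* Models a check that walks every page of the block (contrast only). */
static bool toy_any_page_poisoned(const struct toy_page *pages,
				  size_t head, unsigned int order)
{
	size_t i;

	for (i = 0; i < ((size_t)1 << order); i++)
		if (pages[head + i].hwpoison)
			return true;
	return false;
}
```

With one free order-3 block headed at index 0, poisoning index 5 leaves
the head's bit clear in this model, so a head-only test does not see it
while a walk over the block does.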