Date: Fri, 18 Oct 2019 09:33:10 +0200
From: Michal Hocko
To: Naoya Horiguchi
Cc: Qian Cai, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", David Hildenbrand, Mike Kravetz
Subject: Re: memory offline infinite loop after soft offline
Message-ID: <20191018073310.GB5017@dhcp22.suse.cz>
In-Reply-To: <20191018063222.GA15406@hori.linux.bs1.fc.nec.co.jp>

On Fri 18-10-19 06:32:22, Naoya Horiguchi wrote:
> On Fri, Oct 18, 2019 at 08:06:35AM +0200, Michal Hocko wrote:
> > On Fri 18-10-19 02:19:06, Naoya Horiguchi wrote:
> > > On Thu, Oct 17, 2019 at 08:27:59PM +0200, Michal Hocko wrote:
> > > > On Thu 17-10-19 14:07:13, Qian Cai wrote:
> > > > > On Thu, 2019-10-17 at 12:01 +0200, Michal Hocko wrote:
> > > > > > On Thu 17-10-19 09:34:10, Naoya Horiguchi wrote:
> > > > > > > On Mon, Oct 14, 2019 at 10:39:14AM +0200, Michal Hocko wrote:
> > > > > >
> > > > > > [...]
> > > > > > > > diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> > > > > > > > index 89c19c0feadb..5fb3fee16fde 100644
> > > > > > > > --- a/mm/page_isolation.c
> > > > > > > > +++ b/mm/page_isolation.c
> > > > > > > > @@ -274,7 +274,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
> > > > > > > >  			 * simple way to verify that as VM_BUG_ON(), though.
> > > > > > > >  			 */
> > > > > > > >  			pfn += 1 << page_order(page);
> > > > > > > > -		else if (skip_hwpoisoned_pages && PageHWPoison(page))
> > > > > > > > +		else if (skip_hwpoisoned_pages && PageHWPoison(compound_head(page)))
> > > > > > > >  			/* A HWPoisoned page cannot be also PageBuddy */
> > > > > > > >  			pfn++;
> > > > > > > >  		else
> > > > > > >
> > > > > > > This fix looks good to me. The original code only addresses hwpoisoned 4kB-page,
> > > > > > > we seem to have this issue since the following commit,
> > > > > >
> > > > > > Thanks a lot for double checking Naoya!
> > > > > >
> > > > > > > commit b023f46813cde6e3b8a8c24f432ff9c1fd8e9a64
> > > > > > > Author: Wen Congyang
> > > > > > > Date: Tue Dec 11 16:00:45 2012 -0800
> > > > > > >
> > > > > > >     memory-hotplug: skip HWPoisoned page when offlining pages
> > > > > > >
> > > > > > > and extension of LTP coverage finally discovered this.
> > > > > >
> > > > > > Qian, could you give the patch some testing?
> > > > >
> > > > > Unfortunately, this does not solve the problem. It looks to me that in
> > > > > soft_offline_huge_page(), set_hwpoison_free_buddy_page() will only set
> > > > > PG_hwpoison for buddy pages, so the even the compound_head() has no PG_hwpoison
> > > > > set.
> > > > >
> > > > > 	if (PageBuddy(page_head) && page_order(page_head) >= order) {
> > > > > 		if (!TestSetPageHWPoison(page))
> > > > > 			hwpoisoned = true;
> > > >
> > > > This is more than unexpected. How are we supposed to find out that the
> > > > page is poisoned? Any idea Naoya?
> > >
> > > # sorry for my poor review...
> > >
> > > We set PG_hwpoison bit only on the head page for hugetlb, that's because
> > > we handle multiple pages as a single one for hugetlb. So it's enough
> > > to check isolation only on the head page. Simply skipping pfn cursor to
> > > the page after the hugepage should avoid the infinite loop:
> >
> > But the page dump Qian provided shows that the head page doesn't have
> > HWPoison bit either. If it had then going pfn at a time should just work
> > because all tail pages would be skipped. Or do I miss something?
>
> You're right, then I don't see how this happens.

OK, this is a bit relieving. I thought that there are legitimate cases
when none of the hugetlb gets the HWPoison bit (e.g. when the page has 0
reference count which is the case here). That would be utterly broken
because we would have no way to tell the page is hwpoisoned.

Anyway, do you think the patch as I've posted makes sense regardless
of another potential problem? Or would you like to resend yours which
skips over tail pages at once?
-- 
Michal Hocko
SUSE Labs
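
For readers following the thread, the scan being debated can be modelled in user space. What follows is a minimal sketch under stated assumptions, not kernel code: `struct toy_page`, `toy_make_compound()` and `toy_scan()` are hypothetical names invented for illustration, the array index stands in for the pfn, and `skip_hwpoisoned_pages` is assumed to be true. It only mirrors the three branches visible in the quoted hunk (buddy page, hwpoisoned page, anything else).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; this is NOT the kernel's layout. */
struct toy_page {
	bool buddy;      /* free page owned by the buddy allocator */
	unsigned order;  /* buddy order, meaningful only when buddy is set */
	bool hwpoison;   /* models PG_hwpoison on this exact page */
	size_t head;     /* index of the compound head (self if not compound) */
};

/* Mark pages[head .. head + nr - 1] as one compound page with no flags set. */
static void toy_make_compound(struct toy_page *pages, size_t head, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		pages[head + i] = (struct toy_page){ .head = head };
}

/*
 * Walk [pfn, end) the way the quoted hunk does. Returns the first pfn
 * the scan could not step over, or a value >= end if the range passed.
 * check_head selects between the old test (flag on the page itself)
 * and the proposed one (flag on the compound head).
 */
static size_t toy_scan(const struct toy_page *pages, size_t pfn, size_t end,
		       bool check_head)
{
	while (pfn < end) {
		const struct toy_page *page = &pages[pfn];
		const struct toy_page *flag_page;

		assert(page->head <= pfn);
		flag_page = check_head ? &pages[page->head] : page;

		if (page->buddy)
			pfn += (size_t)1 << page->order;
		else if (flag_page->hwpoison)
			pfn++;
		else
			break;
	}
	return pfn;
}
```

With an 8-page compound page whose head alone carries the flag, the old test stops at the first tail page while the head-based test walks the whole range. If the head carries no flag either, which is what Qian's page dump shows, both variants stop at the head, so the range never passes the check.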