Date: Wed, 11 Aug 2021 21:28:13 -0700
From: "Luck, Tony"
To: Naoya Horiguchi
Cc: Oscar Salvador, Muchun Song, Mike Kravetz, linux-mm@kvack.org,
    Andrew Morton, Michal Hocko, Naoya Horiguchi,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 1/2] mm,hwpoison: fix race with hugetlb page allocation
Message-ID: <20210812042813.GA1576603@agluck-desk2.amr.corp.intel.com>
References: <20210603233632.2964832-1-nao.horiguchi@gmail.com>
    <20210603233632.2964832-2-nao.horiguchi@gmail.com>
In-Reply-To: <20210603233632.2964832-2-nao.horiguchi@gmail.com>

On Fri, Jun 04, 2021 at 08:36:31AM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi
>
> When hugetlb page fault (under overcommitting situation) and
> memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race:
>
>     CPU0:                           CPU1:
>
>                                     gather_surplus_pages()
>                                       page = alloc_surplus_huge_page()
>     memory_failure_hugetlb()
>       get_hwpoison_page(page)
>         __get_hwpoison_page(page)
>           get_page_unless_zero(page)
>                                       zero = put_page_testzero(page)
>                                       VM_BUG_ON_PAGE(!zero, page)
>                                       enqueue_huge_page(h, page)
>       put_page(page)
>
> __get_hwpoison_page() only checks the page refcount before taking an
> additional one for memory error handling, which is not enough because
> there's a time window where compound pages have non-zero refcount during
> hugetlb page initialization.
>
> So make __get_hwpoison_page() check page status a bit more for hugetlb
> pages with get_hwpoison_huge_page(). Checking hugetlb-specific flags
> under hugetlb_lock makes sure that the hugetlb page is not transitive.
> It's notable that another new function, HWPoisonHandlable(), is helpful
> to prevent a race against other transitive page states (like a generic
> compound page just before PageHuge becomes true).

I'm seeing some strange results when doing a simple injection/recovery.

Current upstream often fails to offline the page with messages like:

	"high-order kernel page" or "unknown page"

Things were working in v5.12. Broken in v5.13. Bisect says that:

  25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")

is the culprit (though it is possible that there is more than one issue
... failure symptoms changed a bit during the bisection).

This commit doesn't revert automatically from upstream. But it does
revert from v5.13. Running with this reverted from v5.13 gives a kernel
that recovers normally[1] from hundreds of consecutive error injections.

-Tony

[1] Almost normally. My test catches SIGBUS and prints the virtual
address from the siginfo_t structure. Sometimes the address is correct;
other times it is NULL.
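
For anyone who wants to reproduce the check described in [1], here is a
minimal sketch of a SIGBUS handler that records si_addr from siginfo_t.
It is not the actual test program. The helper names
(install_sigbus_handler, probe_sigbus_addr) are hypothetical, and as a
stand-in for real error injection it provokes SIGBUS by touching a file
mapping beyond end-of-file, which Linux reports as BUS_ADRERR. A real
memory-failure SIGBUS would instead arrive with si_code BUS_MCEERR_AR or
BUS_MCEERR_AO.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;
/* Written by the handler, read after recovery. */
static void *volatile last_fault_addr;
static volatile sig_atomic_t last_fault_code;

/* Record what the kernel reported, then jump back out of the fault. */
static void sigbus_handler(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	last_fault_addr = info->si_addr;
	last_fault_code = info->si_code;
	siglongjmp(recover_point, 1);
}

/* Hypothetical helper: install the handler with SA_SIGINFO so that
 * the siginfo_t argument is filled in. Returns 0 on success. */
int install_sigbus_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sigbus_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGBUS, &sa, NULL);
}

/* Hypothetical helper: stand-in for error injection. Maps one page of
 * an empty file and reads from it, which raises SIGBUS. Stores the
 * touched address in *expected and returns the si_addr the handler
 * saw (NULL if setup failed or no SIGBUS was delivered). */
void *probe_sigbus_addr(void **expected)
{
	char path[] = "/tmp/sigbus-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	int fd = mkstemp(path);
	char *map;

	if (fd < 0)
		return NULL;
	unlink(path);

	/* The file has size 0, so the whole mapping lies beyond EOF. */
	map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return NULL;

	*expected = map;
	last_fault_addr = NULL;
	last_fault_code = 0;

	/* savemask=1 restores the signal mask, unblocking SIGBUS again. */
	if (sigsetjmp(recover_point, 1) == 0) {
		volatile char c = *(volatile char *)map; /* faults here */
		(void)c;
	}

	printf("SIGBUS si_code=%d si_addr=%p (touched %p)\n",
	       (int)last_fault_code, last_fault_addr, (void *)map);
	munmap(map, (size_t)page);
	return last_fault_addr;
}
```

Call install_sigbus_handler() once, then probe_sigbus_addr() with a
pointer to receive the touched address. Comparing the returned si_addr
with that address is the same check that sometimes comes back NULL in
my test.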