Date: Thu, 15 Oct 2020 19:44:16 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, aneesh.kumar@linux.ibm.com,
 aneesh.kumar@linux.vnet.ibm.com, aris@ruivo.org, cai@lca.pw,
 dave.hansen@intel.com, david@redhat.com, linux-mm@kvack.org,
 mhocko@kernel.org, mike.kravetz@oracle.com, mm-commits@vger.kernel.org,
 naoya.horiguchi@nec.com, osalvador@suse.com, osalvador@suse.de,
 tony.luck@intel.com, torvalds@linux-foundation.org, zeil@yandex-team.ru
Subject: [patch 054/156] mm,hwpoison: try to narrow window race for free pages
Message-ID: <20201016024416.RNt5orXte%akpm@linux-foundation.org>
In-Reply-To: <20201015192732.f448da14e9854c7cb7299956@linux-foundation.org>

From: Oscar Salvador
Subject: mm,hwpoison: try to narrow window race for free pages

Aristeu Rozanski reported that a customer test case started to report
-EBUSY after the hwpoison rework patchset.

There is a race window between spotting a free page and taking it off its
buddy freelist, so it might be that by the time we try to take it off, the
page has been already allocated.

This patch tries to handle such race window by trying to handle the new
type of page again if the page was allocated under us.

Link: https://lkml.kernel.org/r/20200922135650.1634-15-osalvador@suse.de
Signed-off-by: Oscar Salvador
Reported-by: Aristeu Rozanski
Tested-by: Aristeu Rozanski
Acked-by: Naoya Horiguchi
Cc: "Aneesh Kumar K.V"
Cc: Aneesh Kumar K.V
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Dmitry Yakunin
Cc: Michal Hocko
Cc: Mike Kravetz
Cc: Oscar Salvador
Cc: Qian Cai
Cc: Tony Luck
Signed-off-by: Andrew Morton
---

 mm/memory-failure.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

--- a/mm/memory-failure.c~mmhwpoison-try-to-narrow-window-race-for-free-pages
+++ a/mm/memory-failure.c
@@ -1903,6 +1903,7 @@ int soft_offline_page(unsigned long pfn,
 {
 	int ret;
 	struct page *page;
+	bool try_again = true;
 
 	if (!pfn_valid(pfn))
 		return -ENXIO;
@@ -1918,6 +1919,7 @@ int soft_offline_page(unsigned long pfn,
 		return 0;
 	}
 
+retry:
 	get_online_mems();
 	ret = get_any_page(page, pfn, flags);
 	put_online_mems();
@@ -1925,7 +1927,10 @@ int soft_offline_page(unsigned long pfn,
 	if (ret > 0)
 		ret = soft_offline_in_use_page(page);
 	else if (ret == 0)
-		ret = soft_offline_free_page(page);
+		if (soft_offline_free_page(page) && try_again) {
+			try_again = false;
+			goto retry;
+		}
 
 	return ret;
 }
_
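
For readers without the kernel tree at hand, the retry-once idea described
in the changelog can be sketched in plain userspace C. This is only an
illustration under stated assumptions: struct fake_page, classify_page(),
take_free_page(), handle_in_use_page() and offline_with_one_retry() are
hypothetical names invented for this sketch, not kernel functions, and the
"race" is simulated with a counter rather than a real concurrent allocator.
The sketch also stores the helper's result in ret, which is a choice of
this example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical page states for this sketch; not kernel types. */
enum page_state { PAGE_FREE, PAGE_IN_USE };

struct fake_page {
	enum page_state state;
	int allocs_pending;	/* simulated racing allocations still to come */
};

/* Hypothetical classifier: returns 0 for a free page, 1 for one in use. */
static int classify_page(const struct fake_page *p)
{
	return p->state == PAGE_IN_USE ? 1 : 0;
}

/*
 * Hypothetical helper: try to take a page that was seen free off the
 * free list. If a simulated racing allocation wins, the page becomes
 * in use and -EBUSY is reported.
 */
static int take_free_page(struct fake_page *p)
{
	if (p->allocs_pending > 0) {
		p->allocs_pending--;
		p->state = PAGE_IN_USE;
		return -EBUSY;
	}
	return 0;
}

/* Hypothetical helper for in-use pages; always succeeds in this sketch. */
static int handle_in_use_page(struct fake_page *p)
{
	(void)p;
	return 0;
}

/*
 * Classify the page and handle it according to its type. If the page
 * stops being free between classification and removal, classify it
 * again exactly once and handle it as whatever type it is now.
 */
static int offline_with_one_retry(struct fake_page *p)
{
	bool try_again = true;
	int ret;

retry:
	ret = classify_page(p);
	if (ret > 0) {
		ret = handle_in_use_page(p);
	} else {
		ret = take_free_page(p);
		if (ret && try_again) {
			try_again = false;
			goto retry;
		}
	}
	return ret;
}
```

Calling offline_with_one_retry() on a page initialised as
{ PAGE_FREE, 1 } takes the free path first, loses the simulated race, and
then succeeds through the in-use path on the second pass. The try_again
flag bounds the loop to a single extra attempt, so a page that keeps
changing state cannot make the function spin.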