Date: Wed, 21 Apr 2021 10:21:03 +0200
From: Oscar Salvador
To: Muchun Song
Cc: Michal Hocko, Mike Kravetz, Andrew Morton, Linux Memory Management List, LKML, Naoya Horiguchi
Subject: Re: [External] Re: [PATCH] mm: hugetlb: fix a race between memory-failure/soft_offline and gather_surplus_pages
Message-ID: <20210421082103.GE22456@linux>
References: <20210421060259.67554-1-songmuchun@bytedance.com>

On Wed, Apr 21, 2021 at 04:15:00PM +0800, Muchun Song wrote:
> > The hwpoison side of this looks really suspicious to me. It shouldn't
> > really touch the reference count of hugetlb pages without being very
> > careful (and having hugetlb_lock held). What would happen if the
> > reference count was increased after the page has been enqueed into the
> > pool? This can just blow up later.
>
> If the page has been enqueued into the pool, then the page can be
> allocated to other users. The page reference count will be reset to
> 1 in the dequeue_huge_page_node_exact(). Then memory-failure
> will free the page because of put_page(). This is wrong. Because
> there is another user.

Note that dequeue_huge_page_node_exact() will not hand over any pages
which are poisoned, so in this case it will not be allocated.
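
To make that point concrete, here is a rough userspace sketch of a free-list
dequeue that skips poisoned pages. It is not the kernel code: struct fake_page,
its fields and dequeue_free_page() are names made up for this illustration, and
locking is left out entirely. It only shows the two behaviours discussed above,
namely that a poisoned page is never handed out and that a page which is handed
out has its reference count set to 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; NOT the kernel definition. */
struct fake_page {
	int refcount;       /* 0 while the page sits in the free pool */
	bool hwpoison;      /* models a page marked as hardware-poisoned */
	bool on_free_list;  /* models membership in the free pool */
};

/*
 * Illustrative model of a dequeue from the free pool.
 * Poisoned pages are skipped and stay in the pool untouched; the first
 * healthy free page is removed from the pool and gets refcount 1.
 * Returns NULL when no usable page is left.
 */
static struct fake_page *dequeue_free_page(struct fake_page *pool, size_t n)
{
	assert(pool != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct fake_page *p = &pool[i];

		if (!p->on_free_list)
			continue;
		if (p->hwpoison)
			continue;	/* never hand over a poisoned page */

		p->on_free_list = false;
		p->refcount = 1;	/* the new user holds the only reference */
		return p;
	}

	return NULL;
}
```

With a pool holding one poisoned and one healthy free page, the call returns the
healthy one and leaves the poisoned page in the pool with its count unchanged.
The sketch says nothing about what protects the pool against concurrent
reference count changes, which is the hugetlb_lock question raised above.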
But it is true that we might need the hugetlb lock; this needs some more
thought. I will have a look.

-- 
Oscar Salvador
SUSE L3