From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8A601C433F5 for ; Thu, 19 May 2022 21:07:06 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 022496B0071; Thu, 19 May 2022 17:07:06 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id F11AA6B0072; Thu, 19 May 2022 17:07:05 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id DB1FE6B0073; Thu, 19 May 2022 17:07:05 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id CC2EE6B0071 for ; Thu, 19 May 2022 17:07:05 -0400 (EDT) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id A04B1612E8 for ; Thu, 19 May 2022 21:07:05 +0000 (UTC) X-FDA: 79483727610.13.24CCE06 Received: from mail-lj1-f181.google.com (mail-lj1-f181.google.com [209.85.208.181]) by imf09.hostedemail.com (Postfix) with ESMTP id 60F8F1400CA for ; Thu, 19 May 2022 21:06:53 +0000 (UTC) Received: by mail-lj1-f181.google.com with SMTP id a23so7605777ljd.9 for ; Thu, 19 May 2022 14:07:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=axWPy7YcnsuFBx8+grDR9Qu5vMMacXXpQgrKtbDlvZY=; b=BYCGqjUkAcMZDqw4kW7/0uGS0GXP1NU2sKkrazkNIcP9yIMwA1DslSJ3qaphfmtd91 G1fqp0Z5GL0F0yC/SWVYSFIZs6BL/z3cUpPO4H/MbH+xvqV0LuW5jgl9Z9bk3smicgrI 29lPEtbBnMw52O3WVuRCv1dElkJgJtHJ7mc0jeIbnxm13wqX6xzmGEBgvVyuD+df9pme ehwLhucGOPSZavo2aMDTgyiM1Uw4BFfa525qmjsp0XIK2eHxMeo4Prl3YKntLaO2ameM t2lZLTfDzRnxDiYN0DZW2T1t1l+ax4qRZJgHu8HUafawBPxkPoq4sgJ75F+PWjpGrPZ8 AX5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=axWPy7YcnsuFBx8+grDR9Qu5vMMacXXpQgrKtbDlvZY=; b=1/0QFk83NZKRMXoC1/7gFdor9HSxhWlmPuaaKCnKqNG7MpwgsIAT5k7+II4wWLEW1S w1melXbvDzDsliK9uZ8nAsxUnBHCJHmmGOADx381X487gQ8ZXFlF3q5+/vYqlU0lLwm8 CInDb5nERmNvZF4aRPRda/GuA3tIUJvapfA8vKToEmM5s780FskrkPQEHm2tc4qqvtKD Kp4vUHCWprMQsuj6MGA2FQuo2Yyq0YmMlRZD+lZQKKZAf6KcVddM5t8Ue/nRRGn0EPo/ nKuhn//XzqH2bQ9p++NFueKyyP4Z+6J+DkgL19fQPVDWQKo43Oa7C2wf9JldSS2HZY0Z SySw== X-Gm-Message-State: AOAM533EqjxJ9vUUl0htuHgNHkNVxsCU4fIOG2Sh8jYXge2Sm/0Vf3nC 6w9r6Brhq6FL3kLbOiD2/+Xq8IGvhQua2t0kDwgNrw== X-Google-Smtp-Source: ABdhPJwriWYcZ/E2QPFOe7CzpNwpBT/pEoiEC4I9PAfRv0+COg19TsZiRzU/KYw+s7+6wjTpROBHcgyo6B5RgDNsYGg= X-Received: by 2002:a05:651c:1508:b0:250:5b32:55d5 with SMTP id e8-20020a05651c150800b002505b3255d5mr3708923ljf.278.1652994422212; Thu, 19 May 2022 14:07:02 -0700 (PDT) MIME-Version: 1.0 References: <20220504214437.2850685-1-zokeefe@google.com> <20220504214437.2850685-2-zokeefe@google.com> In-Reply-To: From: "Zach O'Keefe" Date: Thu, 19 May 2022 14:06:25 -0700 Message-ID: Subject: Re: [PATCH v5 01/13] mm/khugepaged: record SCAN_PMD_MAPPED when scan_pmd() finds THP To: Peter Xu Cc: Alex Shi , David Hildenbrand , David Rientjes , Matthew Wilcox , Michal Hocko , Pasha Tatashin , SeongJae Park , Song Liu , Vlastimil Babka , Yang Shi , Zi Yan , linux-mm@kvack.org, Andrea Arcangeli , Andrew Morton , Arnd Bergmann , Axel Rasmussen , Chris Kennelly , Chris Zankel , Helge Deller , Hugh Dickins , Ivan Kokshaysky , "James E.J. Bottomley" , Jens Axboe , "Kirill A. Shutemov" , Matt Turner , Max Filippov , Miaohe Lin , Minchan Kim , Patrick Xia , Pavel Begunkov , Thomas Bogendoerfer Content-Type: text/plain; charset="UTF-8" X-Stat-Signature: se7n4x5ie67h573j75uuh8w9p1km5mne Authentication-Results: imf09.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=BYCGqjUk; spf=pass (imf09.hostedemail.com: domain of zokeefe@google.com designates 209.85.208.181 as permitted sender) smtp.mailfrom=zokeefe@google.com; dmarc=pass (policy=reject) header.from=google.com X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 60F8F1400CA X-HE-Tag: 1652994413-804714 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Thanks again for the review, Peter. On Wed, May 18, 2022 at 11:41 AM Peter Xu wrote: > > On Wed, May 04, 2022 at 02:44:25PM -0700, Zach O'Keefe wrote: > > +static int find_pmd_or_thp_or_none(struct mm_struct *mm, > > + unsigned long address, > > + pmd_t **pmd) > > +{ > > + pmd_t pmde; > > + > > + *pmd = mm_find_pmd_raw(mm, address); > > + if (!*pmd) > > + return SCAN_PMD_NULL; > > + > > + pmde = pmd_read_atomic(*pmd); > > It seems to be correct on using the atomic fetcher here. Though irrelevant > to this patchset.. does it also mean that we miss that on mm_find_pmd()? I > meant a separate fix like this one: > > ---8<--- > diff --git a/mm/rmap.c b/mm/rmap.c > index 69416072b1a6..61309718640f 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -785,7 +785,7 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address) > * without holding anon_vma lock for write. So when looking for a > * genuine pmde (in which to find pte), test present and !THP together. > */ > - pmde = *pmd; > + pmde = pmd_read_atomic(pmd); > barrier(); > if (!pmd_present(pmde) || pmd_trans_huge(pmde)) > pmd = NULL; > ---8<--- > > As otherwise it seems it's also prone to PAE race conditions when reading > pmd out, but I could be missing something. > This is a good question. I took some time to look into this, but it's very complicated and unfortunately I couldn't reach a conclusion in the time I allotted to myself. My working (unverified) assumption is that mm_find_pmd() is called in places where it doesn't care if the pmd isn't read atomically. If so, does that also mean MADV_COLLAPSE is safe? I'm not sure. These i386 PAE + THP racing issues were most recently discussed when considering if READ_ONCE() should be used instead of pmd_read_atomic() [1]. [1] https://lore.kernel.org/linux-mm/594c1f0-d396-5346-1f36-606872cddb18@google.com/ > > + > > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE > > + /* See comments in pmd_none_or_trans_huge_or_clear_bad() */ > > + barrier(); > > +#endif > > + if (!pmd_present(pmde)) > > + return SCAN_PMD_NULL; > > + if (pmd_trans_huge(pmde)) > > + return SCAN_PMD_MAPPED; > > Would it be safer to check pmd_bad()? I think not all mm pmd paths check > that, frankly I don't really know what's the major cause of a bad pmd > (either software bugs or corrupted mem), but just to check with you, > because potentially a bad pmd can be read as SCAN_SUCCEED and go through. > Likewise, I'm not sure what the cause of "bad pmds" is. Do you mean to check pmd_bad() instead of pmd_trans_huge()? I.e. b/c a pmd-mapped thp counts as "bad" (at least on x86 since PSE set) or do you mean to additionally check pmd_bad() after the pmd_trans_huge() check? If it's the former, I'd say we can't claim !pmd_bad() == memory already backed by thps / our job here is done. If it's the latter, I don't see it hurting much (but I can't argue intelligently about why it's needed) and can include the check in v6. Thanks again, Zach > > + return SCAN_SUCCEED; > > +} > > The rest looks good to me, thanks. > > -- > Peter Xu >