Date: Tue, 26 Apr 2022 20:26:32 -0400
From: Peter Xu
To: Zach O'Keefe
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox, Michal Hocko,
    Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka, Yang Shi, Zi Yan,
    linux-mm@kvack.org, Andrea Arcangeli, Andrew Morton, Arnd Bergmann,
    Axel Rasmussen, Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
    Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe, "Kirill A. Shutemov",
    Matt Turner, Max Filippov, Miaohe Lin, Minchan Kim, Patrick Xia,
    Pavel Begunkov, Thomas Bogendoerfer, kernel test robot
Subject: Re: [PATCH v3 01/12] mm/khugepaged: record SCAN_PMD_MAPPED when scan_pmd() finds THP
References: <20220426144412.742113-1-zokeefe@google.com> <20220426144412.742113-2-zokeefe@google.com>
In-Reply-To: <20220426144412.742113-2-zokeefe@google.com>

Hi, Zach,

On Tue, Apr 26, 2022 at 07:44:01AM -0700, Zach O'Keefe wrote:
> When scanning an anon pmd to see if it's eligible for collapse, return
> SCAN_PMD_MAPPED if the pmd already maps a THP. Note that
> SCAN_PMD_MAPPED is different from SCAN_PAGE_COMPOUND used in the
> file-collapse path, since the latter might identify pte-mapped compound
> pages. This is required by MADV_COLLAPSE which necessarily needs to
> know what hugepage-aligned/sized regions are already pmd-mapped.
>
> Signed-off-by: Zach O'Keefe
> Reported-by: kernel test robot

IIUC we don't need to attach this Reported-by if the patch is not a
bugfix.  I think you can simply fix all the issues reported by the test
bot, and only attach the tag when the patch explicitly fixes the problem
the bot reported.

> ---
>  include/trace/events/huge_memory.h |  3 ++-
>  mm/internal.h                      |  1 +
>  mm/khugepaged.c                    | 30 ++++++++++++++++++++++++++----
>  mm/rmap.c                          | 15 +++++++++++++--
>  4 files changed, 42 insertions(+), 7 deletions(-)
>
> diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
> index d651f3437367..9faa678e0a5b 100644
> --- a/include/trace/events/huge_memory.h
> +++ b/include/trace/events/huge_memory.h
> @@ -33,7 +33,8 @@
>  	EM( SCAN_ALLOC_HUGE_PAGE_FAIL,	"alloc_huge_page_failed")	\
>  	EM( SCAN_CGROUP_CHARGE_FAIL,	"ccgroup_charge_failed")	\
>  	EM( SCAN_TRUNCATED,		"truncated")			\
> -	EMe(SCAN_PAGE_HAS_PRIVATE,	"page_has_private")		\
> +	EM( SCAN_PAGE_HAS_PRIVATE,	"page_has_private")		\
> +	EMe(SCAN_PMD_MAPPED,		"page_pmd_mapped")		\

Nit: IMHO it can even be put in the middle of the list, so we don't need
to touch the EMe() line every time. :)  Apart from that, it does sound
proper to me to put SCAN_PMD_MAPPED right after SCAN_PMD_NULL anyway.
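For example, something along these lines is what I have in mind (only a
sketch; I'm quoting the neighbouring EM() entries from memory, so please
double-check the exact context):

	EM( SCAN_PMD_NULL,		"pmd_null")			\
	EM( SCAN_PMD_MAPPED,		"page_pmd_mapped")		\
	EM( SCAN_EXCEED_NONE_PTE,	"exceed_none_pte")		\
	...
	EMe(SCAN_PAGE_HAS_PRIVATE,	"page_has_private")		\

That way the trace string list keeps matching the enum order, and the
trailing EMe() entry never needs to change when a new scan status is
added.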
>
>  #undef EM
>  #undef EMe
> diff --git a/mm/internal.h b/mm/internal.h
> index 0667abd57634..51ae9f71a2a3 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -172,6 +172,7 @@ extern void reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason
>  /*
>   * in mm/rmap.c:
>   */
> +pmd_t *mm_find_pmd_raw(struct mm_struct *mm, unsigned long address);
>  extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
>
>  /*
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index ba8dbd1825da..2933b13fc975 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -51,6 +51,7 @@ enum scan_result {
>  	SCAN_CGROUP_CHARGE_FAIL,
>  	SCAN_TRUNCATED,
>  	SCAN_PAGE_HAS_PRIVATE,
> +	SCAN_PMD_MAPPED,
>  };
>
>  #define CREATE_TRACE_POINTS
> @@ -987,6 +988,29 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
>  	return 0;
>  }
>
> +static int find_pmd_or_thp_or_none(struct mm_struct *mm,
> +				   unsigned long address,
> +				   pmd_t **pmd)
> +{
> +	pmd_t pmde;
> +
> +	*pmd = mm_find_pmd_raw(mm, address);
> +	if (!*pmd)
> +		return SCAN_PMD_NULL;
> +
> +	pmde = pmd_read_atomic(*pmd);
> +
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +	/* See comments in pmd_none_or_trans_huge_or_clear_bad() */
> +	barrier();
> +#endif
> +	if (!pmd_present(pmde) || pmd_none(pmde))

Could we drop the pmd_none() check?  I assume !pmd_present() should have
covered that case already?

> +		return SCAN_PMD_NULL;
> +	if (pmd_trans_huge(pmde))
> +		return SCAN_PMD_MAPPED;
> +	return SCAN_SUCCEED;
> +}
> +
>  /*
>   * Bring missing pages in from swap, to complete THP collapse.
>   * Only done if khugepaged_scan_pmd believes it is worthwhile.
> @@ -1238,11 +1262,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> -	pmd = mm_find_pmd(mm, address);
> -	if (!pmd) {
> -		result = SCAN_PMD_NULL;
> +	result = find_pmd_or_thp_or_none(mm, address, &pmd);
> +	if (result != SCAN_SUCCEED)
>  		goto out;
> -	}
>
>  	memset(khugepaged_node_load, 0, sizeof(khugepaged_node_load));
>  	pte = pte_offset_map_lock(mm, pmd, address, &ptl);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 61e63db5dc6f..49817f35e65c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -759,13 +759,12 @@ unsigned long page_address_in_vma(struct page *page, struct vm_area_struct *vma)
>  	return vma_address(page, vma);
>  }
>
> -pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
> +pmd_t *mm_find_pmd_raw(struct mm_struct *mm, unsigned long address)
>  {
>  	pgd_t *pgd;
>  	p4d_t *p4d;
>  	pud_t *pud;
>  	pmd_t *pmd = NULL;
> -	pmd_t pmde;
>
>  	pgd = pgd_offset(mm, address);
>  	if (!pgd_present(*pgd))
> @@ -780,6 +779,18 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
>  		goto out;
>
>  	pmd = pmd_offset(pud, address);
> +out:
> +	return pmd;
> +}
> +
> +pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
> +{
> +	pmd_t pmde;
> +	pmd_t *pmd;
> +
> +	pmd = mm_find_pmd_raw(mm, address);
> +	if (!pmd)
> +		goto out;
>  	/*
>  	 * Some THP functions use the sequence pmdp_huge_clear_flush(), set_pmd_at()
>  	 * without holding anon_vma lock for write. So when looking for a
> --
> 2.36.0.rc2.479.g8af0fa9b8e-goog
>

--
Peter Xu