From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 79E31C2D0E5 for ; Fri, 27 Mar 2020 21:32:10 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 2DDCD206F6 for ; Fri, 27 Mar 2020 21:32:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="dT9KaL6H" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2DDCD206F6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id BFFC76B0036; Fri, 27 Mar 2020 17:32:09 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BB0606B006E; Fri, 27 Mar 2020 17:32:09 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AC7436B0078; Fri, 27 Mar 2020 17:32:09 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0080.hostedemail.com [216.40.44.80]) by kanga.kvack.org (Postfix) with ESMTP id 926196B0036 for ; Fri, 27 Mar 2020 17:32:09 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 381368248068 for ; Fri, 27 Mar 2020 21:32:09 +0000 (UTC) X-FDA: 76642440378.18.alley32_6a114a94aa746 X-HE-Tag: alley32_6a114a94aa746 X-Filterd-Recvd-Size: 7300 Received: from mail-ed1-f67.google.com (mail-ed1-f67.google.com [209.85.208.67]) by imf50.hostedemail.com (Postfix) with ESMTP for ; Fri, 27 Mar 2020 21:32:08 +0000 (UTC) Received: by mail-ed1-f67.google.com with SMTP id cf14so13010807edb.13 for ; Fri, 27 Mar 2020 14:32:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=d/BcDy55M6k3RsToyaQtP26qvola5RUqyCk9DPFq0TY=; b=dT9KaL6H+W6iFJ4UK/UE1Ag6+7J+nTZ56mmSDmdPsBjK2xLeysBUgDFdPJ2GEJmFXF 5cZYPpL1/6d5OmpHh+GdgM8MEbTnuRT1F48G20Dkr3ARIWb/nsNTbZbsAbYy2aI1sTEr 0+dYI7gMBDdb8M78ERSECrXwZaUue6G9Xbh62yqtCGeu8ZwTxi6vGSNValRRWGxHxPxv S0qJmPHgtVEHQiVnAUWGQK35vMwYrPcHDANLob/6OpglYROKZ8A5CeKTS0HiGRP/0B3B oQAPX4W+xBHCJaVfCpp3+XIH5hncUjOaggfZeSAwfZJWnCVRfI8XpGgXWRK6ooDGzpCR dYmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=d/BcDy55M6k3RsToyaQtP26qvola5RUqyCk9DPFq0TY=; b=dDLm3P/DhX5GMjIHk13sno8VLVWW2Wbhyn9L8zo+j76vtL+INzyAPGQOyZOlpc8s+E RuSH3ZaUl7uDBv3FHP/ccB7Z5dJFsixJWXvLMZAia0SCJq7qTuEX0Q7xxzGTNmVTMFie nWuaUMAQH6djYhklXKE+qe3KMz5u3MQaecxHw/udqaiJX2e8gC1OMrR+Dem+ROkcz687 vUe8tb0WbyunQrbVvAtaQ0UGc7KNaQOn6flkOsUFzVkGnuIgHlEjL+ln699+vATrGjBs BQh6GGvmfTRJbU6IYRMTx7Mf0ibRdYlpYVJ5+qapxZf3v8at1WOz97zvYO6IGZM3graN n0iw== X-Gm-Message-State: ANhLgQ2I7u9K09vgNDStq2+gOmUWf7JWhRAWv9U7UZepZ/QKrFsgUhkS NH3ymZpFWEoF+BuDLVKvgXCMhAGBYLvELsl8924= X-Google-Smtp-Source: ADFU+vsJJJGRfxvQ8W0NpLPVRfEQ2EuvR2uA7o9ESLPIjR4d7+tl9r1LSz1bqDCwKkIkPeYq7/CqWEGa7o1yG/sc2XA= X-Received: by 2002:a50:9f6e:: with SMTP id b101mr1197996edf.372.1585344727711; Fri, 27 Mar 2020 14:32:07 -0700 (PDT) MIME-Version: 1.0 References: <20200327170601.18563-1-kirill.shutemov@linux.intel.com> <20200327170601.18563-5-kirill.shutemov@linux.intel.com> In-Reply-To: From: Yang Shi Date: Fri, 27 Mar 2020 14:31:55 -0700 Message-ID: Subject: Re: [PATCH 4/7] khugepaged: Allow to callapse a page shared across fork To: Zi Yan Cc: "Kirill A. Shutemov" , Andrew Morton , Andrea Arcangeli , Linux MM , Linux Kernel Mailing List , "Kirill A. Shutemov" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Mar 27, 2020 at 11:20 AM Zi Yan wrote: > > On 27 Mar 2020, at 13:05, Kirill A. Shutemov wrote: > > > The page can be included into collapse as long as it doesn't have extra > > pins (from GUP or otherwise). > > > > Signed-off-by: Kirill A. Shutemov > > --- > > mm/khugepaged.c | 28 ++++++++++++++++------------ > > 1 file changed, 16 insertions(+), 12 deletions(-) > > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > > index 39e0994abeb8..b47edfe57f7b 100644 > > --- a/mm/khugepaged.c > > +++ b/mm/khugepaged.c > > @@ -581,18 +581,26 @@ static int __collapse_huge_page_isolate(struct vm= _area_struct *vma, > > } > > > > /* > > - * cannot use mapcount: can't collapse if there's a gup p= in. > > - * The page must only be referenced by the scanned proces= s > > - * and page swap cache. > > + * Check if the page has any GUP (or other external) pins= . > > + * > > + * The page table that maps the page has been already unl= inked > > + * from the page table tree and this process cannot get > > + * additinal pin on the page. > > + * > > + * New pins can come later if the page is shared across f= ork, > > + * but not for the this process. It is fine. The other pr= ocess > > + * cannot write to the page, only trigger CoW. > > */ > > - if (page_count(page) !=3D 1 + PageSwapCache(page)) { > > + if (total_mapcount(page) + PageSwapCache(page) !=3D > > + page_count(page)) { > > Do you think having a function for this check would be better? Since the = check is used three times. > > > /* > > * Drain pagevec and retry just in case we can ge= t rid > > * of the extra pin, like in swapin case. > > */ > > lru_add_drain(); > > } > > - if (page_count(page) !=3D 1 + PageSwapCache(page)) { > > + if (total_mapcount(page) + PageSwapCache(page) !=3D > > + page_count(page)) { > > unlock_page(page); > > result =3D SCAN_PAGE_COUNT; > > goto out; > > @@ -680,7 +688,6 @@ static void __collapse_huge_page_copy(pte_t *pte, s= truct page *page, > > } else { > > src_page =3D pte_page(pteval); > > copy_user_highpage(page, src_page, address, vma); > > - VM_BUG_ON_PAGE(page_mapcount(src_page) !=3D 1, sr= c_page); > > Maybe replace it with this? > > VM_BUG_ON_PAGE(page_mapcount(src_page) + PageSwapCache(src_page) !=3D pag= e_count(src_page), src_page); I don't think this is correct either. If a THP is PTE mapped its refcount would be bumped by the number of PTE mapped subpages. But page_mapcount() would just return the mapcount of that specific subpage. So, total_mapcount() should be used, but the same check has been done before reaching here. > > > > release_pte_page(src_page); > > /* > > * ptl mostly unnecessary, but preempt has to > > @@ -1209,12 +1216,9 @@ static int khugepaged_scan_pmd(struct mm_struct = *mm, > > goto out_unmap; > > } > > > > - /* > > - * cannot use mapcount: can't collapse if there's a gup p= in. > > - * The page must only be referenced by the scanned proces= s > > - * and page swap cache. > > - */ > > - if (page_count(page) !=3D 1 + PageSwapCache(page)) { > > + /* Check if the page has any GUP (or other external) pins= */ > > + if (total_mapcount(page) + PageSwapCache(page) !=3D > > + page_count(page)) { > > result =3D SCAN_PAGE_COUNT; > > goto out_unmap; > > } > > -- > > 2.26.0 > > Thanks. > > =E2=80=94 > Best Regards, > Yan Zi