From: Dan Williams
Date: Tue, 1 Jun 2021 18:05:34 -0700
Subject: Re: [PATCH v1 11/11] mm/gup: grab head page refcount once for group of subpages
References: <20210325230938.30752-1-joao.m.martins@oracle.com>
 <20210325230938.30752-12-joao.m.martins@oracle.com>
In-Reply-To: <20210325230938.30752-12-joao.m.martins@oracle.com>
To: Joao Martins
Cc: Linux MM,
 Ira Weiny, Matthew Wilcox, Jason Gunthorpe, Jane Chu,
 Muchun Song, Mike Kravetz, Andrew Morton, nvdimm@lists.linux.dev

On Thu, Mar 25, 2021 at 4:10 PM Joao Martins wrote:
>
> Much like hugetlbfs or THPs, treat device pagemaps with
> compound pages like the rest of GUP handling of compound pages.
>

How about: "Use try_grab_compound_head() for device-dax GUP when
configured with a compound pagemap."

> Rather than incrementing the refcount every 4K, we record
> all sub pages and increment by @refs amount *once*.

"Rather than incrementing the refcount for each page, do one atomic
addition for all the pages to be pinned."

>
> Performance measured by gup_benchmark improves considerably
> get_user_pages_fast() and pin_user_pages_fast() with NVDIMMs:
>
> $ gup_test -f /dev/dax1.0 -m 16384 -r 10 -S [-u,-a] -n 512 -w
> (get_user_pages_fast 2M pages) ~59 ms -> ~6.1 ms
> (pin_user_pages_fast 2M pages) ~87 ms -> ~6.2 ms
> [altmap]
> (get_user_pages_fast 2M pages) ~494 ms -> ~9 ms
> (pin_user_pages_fast 2M pages) ~494 ms -> ~10 ms

Hmm, what is altmap representing here?
The altmap case does not support compound geometry, so this last test
is comparing pinning this amount of memory without compound pages
where the memmap is in PMEM to the speed *with* compound pages and the
memmap in DRAM?

>
> $ gup_test -f /dev/dax1.0 -m 129022 -r 10 -S [-u,-a] -n 512 -w
> (get_user_pages_fast 2M pages) ~492 ms -> ~49 ms
> (pin_user_pages_fast 2M pages) ~493 ms -> ~50 ms
> [altmap with -m 127004]
> (get_user_pages_fast 2M pages) ~3.91 sec -> ~70 ms
> (pin_user_pages_fast 2M pages) ~3.97 sec -> ~74 ms
>
> Signed-off-by: Joao Martins
> ---
>  mm/gup.c | 52 ++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 32 insertions(+), 20 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index b3e647c8b7ee..514f12157a0f 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2159,31 +2159,54 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
>  }
>  #endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
>
> +
> +static int record_subpages(struct page *page, unsigned long addr,
> +			   unsigned long end, struct page **pages)
> +{
> +	int nr;
> +
> +	for (nr = 0; addr != end; addr += PAGE_SIZE)
> +		pages[nr++] = page++;
> +
> +	return nr;
> +}
> +
>  #if defined(CONFIG_ARCH_HAS_PTE_DEVMAP) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
>  static int __gup_device_huge(unsigned long pfn, unsigned long addr,
>  			     unsigned long end, unsigned int flags,
>  			     struct page **pages, int *nr)
>  {
> -	int nr_start = *nr;
> +	int refs, nr_start = *nr;
>  	struct dev_pagemap *pgmap = NULL;
>
>  	do {
> -		struct page *page = pfn_to_page(pfn);
> +		struct page *head, *page = pfn_to_page(pfn);
> +		unsigned long next;
>
>  		pgmap = get_dev_pagemap(pfn, pgmap);
>  		if (unlikely(!pgmap)) {
>  			undo_dev_pagemap(nr, nr_start, flags, pages);
>  			return 0;
>  		}
> -		SetPageReferenced(page);
> -		pages[*nr] = page;
> -		if (unlikely(!try_grab_page(page, flags))) {
> -			undo_dev_pagemap(nr, nr_start, flags, pages);
> +
> +		head = compound_head(page);
> +		next = PageCompound(head) ?
> +			end : addr + PAGE_SIZE;

This looks a tad messy, and makes assumptions that upper layers are
not sending this routine multiple huge pages to map. next should be
set to the next compound page, not end.

> +		refs = record_subpages(page, addr, next, pages + *nr);
> +
> +		SetPageReferenced(head);
> +		head = try_grab_compound_head(head, refs, flags);
> +		if (!head) {
> +			if (PageCompound(head)) {

@head is NULL here. I think you wanted to rename the result of
try_grab_compound_head() to something like pinned_head so that you
don't undo the work you did above. However, I feel like there's one
too many PageCompound() checks.

> +				ClearPageReferenced(head);
> +				put_dev_pagemap(pgmap);
> +			} else {
> +				undo_dev_pagemap(nr, nr_start, flags, pages);
> +			}
>  			return 0;
>  		}
> -		(*nr)++;
> -		pfn++;
> -	} while (addr += PAGE_SIZE, addr != end);
> +		*nr += refs;
> +		pfn += refs;
> +	} while (addr += (refs << PAGE_SHIFT), addr != end);
>
>  	if (pgmap)
>  		put_dev_pagemap(pgmap);
> @@ -2243,17 +2266,6 @@ static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
>  }
>  #endif
>
> -static int record_subpages(struct page *page, unsigned long addr,
> -			   unsigned long end, struct page **pages)
> -{
> -	int nr;
> -
> -	for (nr = 0; addr != end; addr += PAGE_SIZE)
> -		pages[nr++] = page++;
> -
> -	return nr;
> -}
> -
>  #ifdef CONFIG_ARCH_HAS_HUGEPD
>  static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end,
>  				      unsigned long sz)
> --
> 2.17.1
>