Date: Tue, 9 Jun 2020 15:24:38 +0200
From: Michal Hocko
To: js1304@gmail.com
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 kernel-team@lge.com, Vlastimil Babka, Christoph Hellwig, Roman Gushchin,
 Mike Kravetz, Naoya Horiguchi, Joonsoo Kim
Subject: Re: [PATCH v2 03/12] mm/hugetlb: introduce alloc_control structure
 to simplify migration target allocation APIs
Message-ID: <20200609132438.GE22623@dhcp22.suse.cz>
References: <1590561903-13186-1-git-send-email-iamjoonsoo.kim@lge.com>
 <1590561903-13186-4-git-send-email-iamjoonsoo.kim@lge.com>
In-Reply-To: <1590561903-13186-4-git-send-email-iamjoonsoo.kim@lge.com>

On Wed 27-05-20 15:44:54, Joonsoo Kim wrote:
> From: Joonsoo Kim
>
> Currently, the page allocation functions for migration require several
> arguments. Worse, in the following patch, more arguments will be needed
> to unify similar functions. To simplify them, this patch introduces a
> unified data structure that controls allocation behaviour.
>
> For clean-up, function declarations are re-ordered.

This is really hard to review without having a clear picture of the
resulting code, so bear with me. I can see some reasons why allocation
callbacks might benefit from an aggregated argument, but you also touch
the internal hugetlb dequeue_huge_page_vma, which shouldn't really need
it. I wouldn't mind much, but I remember that the layering of the hugetlb
allocation functions is quite complex for hugetlb specific reasons (see
0c397daea1d4 ("mm, hugetlb: further simplify hugetlb allocation API") for
more background). Is there any reason why the aggregated argument cannot
be limited to the migration callbacks only? That would be
alloc_huge_page_node, alloc_huge_page_nodemask and alloc_huge_page_vma.
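Something along these lines is what I have in mind - a completely
untested sketch, only to illustrate the idea. It reuses the alloc_control
layout from your mm/internal.h hunk below, but keeps
dequeue_huge_page_nodemask and alloc_migrate_huge_page with their current
explicit arguments:

struct alloc_control {
        int nid;                /* preferred node id */
        nodemask_t *nmask;
        gfp_t gfp_mask;
};

/* page migration callback function */
struct page *alloc_huge_page_nodemask(struct hstate *h,
                                struct alloc_control *ac)
{
        ac->gfp_mask = htlb_alloc_mask(h);

        spin_lock(&hugetlb_lock);
        if (h->free_huge_pages - h->resv_huge_pages > 0) {
                struct page *page;

                /* the internal dequeue path keeps its explicit arguments */
                page = dequeue_huge_page_nodemask(h, ac->gfp_mask,
                                ac->nid, ac->nmask);
                if (page) {
                        spin_unlock(&hugetlb_lock);
                        return page;
                }
        }
        spin_unlock(&hugetlb_lock);

        /* and so does alloc_migrate_huge_page() */
        return alloc_migrate_huge_page(h, ac->gfp_mask, ac->nid, ac->nmask);
}

That way only the migration entry points would know about the new
structure and the internal hugetlb allocation layering would stay as it
is.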
> Signed-off-by: Joonsoo Kim
> ---
>  include/linux/hugetlb.h | 35 +++++++++++++++-------------
>  mm/gup.c                | 11 ++++++---
>  mm/hugetlb.c            | 62 ++++++++++++++++++++++++-------------------------
>  mm/internal.h           |  7 ++++++
>  mm/mempolicy.c          | 13 +++++++----
>  mm/migrate.c            | 13 +++++++----
>  6 files changed, 83 insertions(+), 58 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 50650d0..15c8fb8 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -14,6 +14,7 @@
>  struct ctl_table;
>  struct user_struct;
>  struct mmu_gather;
> +struct alloc_control;
>
>  #ifndef is_hugepd
>  typedef struct { unsigned long pd; } hugepd_t;
> @@ -502,15 +503,16 @@ struct huge_bootmem_page {
>          struct hstate *hstate;
>  };
>
> -struct page *alloc_huge_page(struct vm_area_struct *vma,
> -                unsigned long addr, int avoid_reserve);
> -struct page *alloc_huge_page_node(struct hstate *h, int nid);
> -struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
> -                nodemask_t *nmask);
> +struct page *alloc_migrate_huge_page(struct hstate *h,
> +                struct alloc_control *ac);
> +struct page *alloc_huge_page_node(struct hstate *h,
> +                struct alloc_control *ac);
> +struct page *alloc_huge_page_nodemask(struct hstate *h,
> +                struct alloc_control *ac);
>  struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
>                  unsigned long address);
> -struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
> -                int nid, nodemask_t *nmask);
> +struct page *alloc_huge_page(struct vm_area_struct *vma,
> +                unsigned long addr, int avoid_reserve);
>  int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
>                  pgoff_t idx);
>
> @@ -752,20 +754,14 @@ static inline void huge_ptep_modify_prot_commit(struct vm_area_struct *vma,
>  #else /* CONFIG_HUGETLB_PAGE */
>  struct hstate {};
>
> -static inline struct page *alloc_huge_page(struct vm_area_struct *vma,
> -                    unsigned long addr,
> -                    int avoid_reserve)
> -{
> -        return NULL;
> -}
> -
> -static inline struct page *alloc_huge_page_node(struct hstate *h, int nid)
> +static inline struct page *
> +alloc_huge_page_node(struct hstate *h, struct alloc_control *ac)
>  {
>          return NULL;
>  }
>
>  static inline struct page *
> -alloc_huge_page_nodemask(struct hstate *h, int preferred_nid, nodemask_t *nmask)
> +alloc_huge_page_nodemask(struct hstate *h, struct alloc_control *ac)
>  {
>          return NULL;
>  }
> @@ -777,6 +773,13 @@ static inline struct page *alloc_huge_page_vma(struct hstate *h,
>          return NULL;
>  }
>
> +static inline struct page *alloc_huge_page(struct vm_area_struct *vma,
> +                    unsigned long addr,
> +                    int avoid_reserve)
> +{
> +        return NULL;
> +}
> +
>  static inline int __alloc_bootmem_huge_page(struct hstate *h)
>  {
>          return 0;
> diff --git a/mm/gup.c b/mm/gup.c
> index ee039d4..6b78f11 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1612,16 +1612,21 @@ static struct page *new_non_cma_page(struct page *page, unsigned long private)
>          if (PageHighMem(page))
>                  gfp_mask |= __GFP_HIGHMEM;
>
> -#ifdef CONFIG_HUGETLB_PAGE
>          if (PageHuge(page)) {
>                  struct hstate *h = page_hstate(page);
> +                struct alloc_control ac = {
> +                        .nid = nid,
> +                        .nmask = NULL,
> +                        .gfp_mask = gfp_mask,
> +                };
> +
>                  /*
>                   * We don't want to dequeue from the pool because pool pages will
>                   * mostly be from the CMA region.
>                   */
> -                return alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
> +                return alloc_migrate_huge_page(h, &ac);
>          }
> -#endif
> +
>          if (PageTransHuge(page)) {
>                  struct page *thp;
>                  /*
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 57ece74..453ba94 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1053,8 +1053,8 @@ static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
>          return page;
>  }
>
> -static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask, int nid,
> -                nodemask_t *nmask)
> +static struct page *dequeue_huge_page_nodemask(struct hstate *h,
> +                struct alloc_control *ac)
>  {
>          unsigned int cpuset_mems_cookie;
>          struct zonelist *zonelist;
> @@ -1062,14 +1062,15 @@ static struct page *dequeue_huge_page_nodemask(struct hstate *h, gfp_t gfp_mask,
>          struct zoneref *z;
>          int node = NUMA_NO_NODE;
>
> -        zonelist = node_zonelist(nid, gfp_mask);
> +        zonelist = node_zonelist(ac->nid, ac->gfp_mask);
>
>  retry_cpuset:
>          cpuset_mems_cookie = read_mems_allowed_begin();
> -        for_each_zone_zonelist_nodemask(zone, z, zonelist, gfp_zone(gfp_mask), nmask) {
> +        for_each_zone_zonelist_nodemask(zone, z, zonelist,
> +                        gfp_zone(ac->gfp_mask), ac->nmask) {
>                  struct page *page;
>
> -                if (!cpuset_zone_allowed(zone, gfp_mask))
> +                if (!cpuset_zone_allowed(zone, ac->gfp_mask))
>                          continue;
>                  /*
>                   * no need to ask again on the same node. Pool is node rather than
> @@ -1105,9 +1106,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
>  {
>          struct page *page;
>          struct mempolicy *mpol;
> -        gfp_t gfp_mask;
> -        nodemask_t *nodemask;
> -        int nid;
> +        struct alloc_control ac = {0};
>
>          /*
>           * A child process with MAP_PRIVATE mappings created by their parent
> @@ -1122,9 +1121,10 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
>          if (avoid_reserve && h->free_huge_pages - h->resv_huge_pages == 0)
>                  goto err;
>
> -        gfp_mask = htlb_alloc_mask(h);
> -        nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
> -        page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
> +        ac.gfp_mask = htlb_alloc_mask(h);
> +        ac.nid = huge_node(vma, address, ac.gfp_mask, &mpol, &ac.nmask);
> +
> +        page = dequeue_huge_page_nodemask(h, &ac);
>          if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
>                  SetPagePrivate(page);
>                  h->resv_huge_pages--;
> @@ -1937,15 +1937,16 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
>          return page;
>  }
>
> -struct page *alloc_migrate_huge_page(struct hstate *h, gfp_t gfp_mask,
> -                int nid, nodemask_t *nmask)
> +struct page *alloc_migrate_huge_page(struct hstate *h,
> +                struct alloc_control *ac)
>  {
>          struct page *page;
>
>          if (hstate_is_gigantic(h))
>                  return NULL;
>
> -        page = alloc_fresh_huge_page(h, gfp_mask, nid, nmask, NULL);
> +        page = alloc_fresh_huge_page(h, ac->gfp_mask,
> +                        ac->nid, ac->nmask, NULL);
>          if (!page)
>                  return NULL;
>
> @@ -1979,36 +1980,37 @@ struct page *alloc_buddy_huge_page_with_mpol(struct hstate *h,
>  }
>
>  /* page migration callback function */
> -struct page *alloc_huge_page_node(struct hstate *h, int nid)
> +struct page *alloc_huge_page_node(struct hstate *h,
> +                struct alloc_control *ac)
>  {
> -        gfp_t gfp_mask = htlb_alloc_mask(h);
>          struct page *page = NULL;
>
> -        if (nid != NUMA_NO_NODE)
> -                gfp_mask |= __GFP_THISNODE;
> +        ac->gfp_mask = htlb_alloc_mask(h);
> +        if (ac->nid != NUMA_NO_NODE)
> +                ac->gfp_mask |= __GFP_THISNODE;
>
>          spin_lock(&hugetlb_lock);
>          if (h->free_huge_pages - h->resv_huge_pages > 0)
> -                page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);
> +                page = dequeue_huge_page_nodemask(h, ac);
>          spin_unlock(&hugetlb_lock);
>
>          if (!page)
> -                page = alloc_migrate_huge_page(h, gfp_mask, nid, NULL);
> +                page = alloc_migrate_huge_page(h, ac);
>
>          return page;
>  }
>
>  /* page migration callback function */
> -struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
> -                nodemask_t *nmask)
> +struct page *alloc_huge_page_nodemask(struct hstate *h,
> +                struct alloc_control *ac)
>  {
> -        gfp_t gfp_mask = htlb_alloc_mask(h);
> +        ac->gfp_mask = htlb_alloc_mask(h);
>
>          spin_lock(&hugetlb_lock);
>          if (h->free_huge_pages - h->resv_huge_pages > 0) {
>                  struct page *page;
>
> -                page = dequeue_huge_page_nodemask(h, gfp_mask, preferred_nid, nmask);
> +                page = dequeue_huge_page_nodemask(h, ac);
>                  if (page) {
>                          spin_unlock(&hugetlb_lock);
>                          return page;
> @@ -2016,22 +2018,20 @@ struct page *alloc_huge_page_nodemask(struct hstate *h, int preferred_nid,
>          }
>          spin_unlock(&hugetlb_lock);
>
> -        return alloc_migrate_huge_page(h, gfp_mask, preferred_nid, nmask);
> +        return alloc_migrate_huge_page(h, ac);
>  }
>
>  /* mempolicy aware migration callback */
>  struct page *alloc_huge_page_vma(struct hstate *h, struct vm_area_struct *vma,
>                  unsigned long address)
>  {
> +        struct alloc_control ac = {0};
>          struct mempolicy *mpol;
> -        nodemask_t *nodemask;
>          struct page *page;
> -        gfp_t gfp_mask;
> -        int node;
>
> -        gfp_mask = htlb_alloc_mask(h);
> -        node = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
> -        page = alloc_huge_page_nodemask(h, node, nodemask);
> +        ac.gfp_mask = htlb_alloc_mask(h);
> +        ac.nid = huge_node(vma, address, ac.gfp_mask, &mpol, &ac.nmask);
> +        page = alloc_huge_page_nodemask(h, &ac);
>          mpol_cond_put(mpol);
>
>          return page;
> diff --git a/mm/internal.h b/mm/internal.h
> index 9886db2..6e613ce 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -613,4 +613,11 @@ static inline bool is_migrate_highatomic_page(struct page *page)
>
>  void setup_zone_pageset(struct zone *zone);
>  extern struct page *alloc_new_node_page(struct page *page, unsigned long node);
> +
> +struct alloc_control {
> +        int nid;                /* preferred node id */
> +        nodemask_t *nmask;
> +        gfp_t gfp_mask;
> +};
> +
>  #endif /* __MM_INTERNAL_H */
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 3813206..3b6b551 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1068,10 +1068,15 @@ static int migrate_page_add(struct page *page, struct list_head *pagelist,
>  /* page allocation callback for NUMA node migration */
>  struct page *alloc_new_node_page(struct page *page, unsigned long node)
>  {
> -        if (PageHuge(page))
> -                return alloc_huge_page_node(page_hstate(compound_head(page)),
> -                                node);
> -        else if (PageTransHuge(page)) {
> +        if (PageHuge(page)) {
> +                struct hstate *h = page_hstate(compound_head(page));
> +                struct alloc_control ac = {
> +                        .nid = node,
> +                        .nmask = NULL,
> +                };
> +
> +                return alloc_huge_page_node(h, &ac);
> +        } else if (PageTransHuge(page)) {
>                  struct page *thp;
>
>                  thp = alloc_pages_node(node,
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 824c22e..30217537 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1544,10 +1544,15 @@ struct page *new_page_nodemask(struct page *page,
>          unsigned int order = 0;
>          struct page *new_page = NULL;
>
> -        if (PageHuge(page))
> -                return alloc_huge_page_nodemask(
> -                        page_hstate(compound_head(page)),
> -                        preferred_nid, nodemask);
> +        if (PageHuge(page)) {
> +                struct hstate *h = page_hstate(compound_head(page));
> +                struct alloc_control ac = {
> +                        .nid = preferred_nid,
> +                        .nmask = nodemask,
> +                };
> +
> +                return alloc_huge_page_nodemask(h, &ac);
> +        }
>
>          if (PageTransHuge(page)) {
>                  gfp_mask |= GFP_TRANSHUGE;
> --
> 2.7.4
>

-- 
Michal Hocko
SUSE Labs