Date: Wed, 10 Feb 2021 15:09:41 +0100
From: Oscar Salvador
To: David Hildenbrand
Cc: Mike Kravetz, Muchun Song, Michal Hocko, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/2] mm,page_alloc: Make alloc_contig_range handle in-use hugetlb pages
Message-ID: <20210210140941.GA3636@localhost.localdomain>
References: <20210208103812.32056-1-osalvador@suse.de> <20210208103812.32056-2-osalvador@suse.de> <6aa21eb3-7bee-acff-8f3c-7c13737066ba@redhat.com>
In-Reply-To: <6aa21eb3-7bee-acff-8f3c-7c13737066ba@redhat.com>

On Wed, Feb 10, 2021 at 09:56:37AM +0100,
David Hildenbrand wrote:
> On 08.02.21 11:38, Oscar Salvador wrote:
> > alloc_contig_range is not prepared to handle hugetlb pages and will
> > fail if it ever sees one, but since they can be migrated as any other
> > page (LRU and Movable), it makes sense to also handle them.
> >
> > For now, do it only when coming from alloc_contig_range.
> >
> > Signed-off-by: Oscar Salvador
> > ---
> >   mm/compaction.c | 17 +++++++++++++++++
> >   mm/vmscan.c     |  5 +++--
> >   2 files changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/compaction.c b/mm/compaction.c
> > index e5acb9714436..89cd2e60da29 100644
> > --- a/mm/compaction.c
> > +++ b/mm/compaction.c
> > @@ -940,6 +940,22 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
> >   			goto isolate_fail;
> >   		}
> > +		/*
> > +		 * Handle hugetlb pages only when coming from alloc_contig
> > +		 */
> > +		if (PageHuge(page) && cc->alloc_contig) {
> > +			if (page_count(page)) {
>
> I wonder if we should care about races here. What if someone concurrently
> allocates/frees?
>
> Note that PageHuge() succeeds on tail pages, isolate_huge_page() not, i
> assume we'll have to handle that as well.
>
> I wonder if it would make sense to move some of the magic to hugetlb code
> and handle it there with less chances for races (isolate if used,
> alloc-and-dissolve if not).

Yes, it makes sense to keep the magic in hugetlb code.
Note, though, that removing all races might be tricky.

isolate_huge_page() checks for PageHuge under hugetlb_lock, so there is a
race between a call to PageHuge(x) and a subsequent call to
isolate_huge_page().
But we should be fine as isolate_huge_page will fail in case the page is
no longer HugeTLB.

Also, since isolate_migratepages_block() gets called with ranges
pageblock aligned, we should never be handling tail pages in the core
of the function. E.g: the same way we handle THP:

	/* The whole page is taken off the LRU; skip the tail pages. */
	if (PageCompound(page))
		low_pfn += compound_nr(page) - 1;

But all in all, the code has to be more bullet-proof. This RFC was more
like a PoC to see whether something crazy was done.
And as I said, moving the handling of hugetlb pages to hugetlb.c might
help towards a better error-race-handling.

Thanks for having a look ;-)

-- 
Oscar Salvador
SUSE L3
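
The two points in the reply (an unlocked PageHuge() test that is only a hint,
rechecked under the lock, and skipping tail pages once the head has been
handled) can be sketched in plain userspace C. This is a toy model under
stated assumptions, not kernel code: struct toy_page, toy_hugetlb_lock,
toy_isolate_huge_page() and toy_scan() are hypothetical names, and a pthread
mutex stands in for hugetlb_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct toy_page {
	bool huge;		/* models PageHuge() on a head page */
	bool isolated;		/* set once the page has been isolated */
	unsigned long nr;	/* models compound_nr(); 1 for a base page */
};

/* Assumption: a userspace mutex standing in for hugetlb_lock. */
static pthread_mutex_t toy_hugetlb_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Recheck the "huge" state under the lock. If the page stopped being
 * huge after the caller's unlocked test, isolation simply fails.
 */
static bool toy_isolate_huge_page(struct toy_page *page)
{
	bool ok = false;

	pthread_mutex_lock(&toy_hugetlb_lock);
	if (page->huge && !page->isolated) {
		page->isolated = true;
		ok = true;
	}
	pthread_mutex_unlock(&toy_hugetlb_lock);
	return ok;
}

/*
 * Scan [start, end) and return the number of huge pages isolated.
 * *visited counts loop iterations, to show that tail pages are skipped.
 */
static unsigned long toy_scan(struct toy_page *pages, unsigned long start,
			      unsigned long end, unsigned long *visited)
{
	unsigned long isolated = 0;

	assert(start <= end);
	*visited = 0;
	for (unsigned long pfn = start; pfn < end; pfn++) {
		struct toy_page *page = &pages[pfn];

		(*visited)++;
		if (!page->huge)	/* unlocked test: only a hint */
			continue;
		if (!toy_isolate_huge_page(page))
			continue;	/* lost the race; treat as failure */
		isolated++;
		/* The whole page was taken; skip its tail pages. */
		pfn += page->nr - 1;
	}
	return isolated;
}
```

In this model the unlocked test can go stale without harm, because the
decision that matters is made with the lock held; the cost is that the
caller has to cope with isolation failing.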