Date: Tue, 15 Sep 2020 19:15:30 +0100
From: Matthew Wilcox
To: Muchun Song
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com,
    bp@alien8.de, x86@kernel.org, hpa@zytor.com,
    dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra,
    viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org,
    mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
    Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com,
    jroedel@suse.de, almasrymina@google.com, David Rientjes,
    linux-doc@vger.kernel.org, LKML, Linux Memory Management List,
    linux-fsdevel@vger.kernel.org
Subject: Re: [External] Re: [RFC PATCH 00/24] mm/hugetlb: Free some vmemmap
    pages of hugetlb page
Message-ID: <20200915181530.GL5449@casper.infradead.org>
References: <20200915125947.26204-1-songmuchun@bytedance.com>
    <20200915143241.GH5449@casper.infradead.org>
    <20200915154213.GI5449@casper.infradead.org>
    <20200915173948.GK5449@casper.infradead.org>

On Wed, Sep 16, 2020 at 02:03:15AM +0800, Muchun Song wrote:
> On Wed, Sep 16, 2020 at 1:39 AM Matthew Wilcox wrote:
> >
> > On Wed, Sep 16, 2020 at 01:32:46AM +0800, Muchun Song wrote:
> > > On Tue, Sep 15, 2020 at 11:42 PM Matthew Wilcox wrote:
> > > >
> > > > On Tue, Sep 15, 2020 at 11:28:01PM +0800, Muchun Song wrote:
> > > > > On Tue, Sep 15, 2020 at 10:32 PM Matthew Wilcox wrote:
> > > > > >
> > > > > > On Tue, Sep 15, 2020 at 08:59:23PM +0800, Muchun Song wrote:
> > > > > > > This patch series will free some vmemmap pages(struct page structures)
> > > > > > > associated with each hugetlbpage when preallocated to save memory.
> > > > > >
> > > > > > It would be lovely to be able to do this. Unfortunately, it's completely
> > > > > > impossible right now. Consider, for example, get_user_pages() called
> > > > > > on the fifth page of a hugetlb page.
> > > > >
> > > > > Can you elaborate on the problem? Thanks so much.
> > > >
> > > > OK, let's say you want to do a 2kB I/O to offset 0x5000 of a 2MB page
> > > > on a 4kB base page system. Today, that results in a bio_vec containing
> > > > {head+5, 0, 0x800}. Then we call page_to_phys() on that (head+5) struct
> > > > page to get the physical address of the I/O, and we turn it into a struct
> > > > scatterlist, which similarly has a reference to the page (head+5).
> > >
> > > As I know, in this case, the get_user_pages() will get a reference
> > > to the head page (head+0) before returning such that the hugetlb
> > > page can not be freed. Although get_user_pages() returns the
> > > page (head+5) and the scatterlist has a reference to the page
> > > (head+5), this patch series can handle this situation. I can not
> > > figure out where the problem is. What I missed? Thanks.
> >
> > You freed pages 4-511 from the vmemmap so they could be used for
> > something else. Page 5 isn't there any more. So if you return head+5,
> > then when we complete the I/O, we'll look for the compound_head() of
> > head+5 and we won't find head.
>
> We do not free pages 4-511 from the vmemmap. Actually, we only
> free pages 128-511 from the vmemmap.
>
> The 512 struct pages occupy 8 pages of physical memory. We only
> free 6 physical page frames to the buddy. But we will create a new
> mapping just like below. The virtual address of the freed pages will
> remap to the second page frame. So the second page frame is
> reused.

Oh! I get what you're doing now.

For the vmemmap case, you free the last N-2 physical pages but map the
second physical page multiple times. So for the 512 pages case, we
see pages:

abcdefgh | ijklmnop | ijklmnop | ijklmnop | ijklmnop | ijklmnop | ijklmnop ...

Huh. I think that might work, except for PageHWPoison. I'll go back
to your patch series and look at that some more.
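
The remapping described above can be modelled with a few lines of arithmetic. This is only a rough sketch, not code from the patch series. It assumes 4 kB base pages and a 64-byte struct page (the size implied by "512 struct pages occupy 8 pages"), and the helper names `backing_frame` and `aliased_index` are made up for illustration.

```python
PAGE_SIZE = 4096         # 4 kB base page, as in the thread
STRUCT_PAGE_SIZE = 64    # assumption: implied by 512 struct pages in 8 pages
NR_SUBPAGES = 512        # 2 MB hugetlb page / 4 kB base pages

PER_VMEMMAP_PAGE = PAGE_SIZE // STRUCT_PAGE_SIZE     # struct pages per vmemmap page
NR_VMEMMAP_PAGES = NR_SUBPAGES // PER_VMEMMAP_PAGE   # vmemmap pages per hugetlb page
KEPT_FRAMES = 2          # the first two physical frames stay allocated


def backing_frame(subpage_index):
    """Hypothetical helper: physical frame (0 or 1) that backs head+subpage_index."""
    if not 0 <= subpage_index < NR_SUBPAGES:
        raise ValueError("subpage index out of range")
    vmemmap_page = subpage_index // PER_VMEMMAP_PAGE
    # Every vmemmap page past the first is mapped onto the second frame.
    return min(vmemmap_page, KEPT_FRAMES - 1)


def aliased_index(subpage_index):
    """Hypothetical helper: index of the struct page whose memory is actually read."""
    frame = backing_frame(subpage_index)
    return frame * PER_VMEMMAP_PAGE + subpage_index % PER_VMEMMAP_PAGE


if __name__ == "__main__":
    print("frames freed:", NR_VMEMMAP_PAGES - KEPT_FRAMES)
    for index in (5, 127, 128, 511):
        print(index, "->", aliased_index(index))
```

Under these assumptions head+5 still lives in the first frame, so it keeps its own struct page, while head+128 through head+511 read back the memory of head+64 through head+127. Any per-subpage state written through one of the aliased indices would land in the shared frame, which may be the concern behind the PageHWPoison remark.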