From: Muchun Song
Date: Mon, 23 Nov 2020 20:40:40 +0800
Subject: Re: [External] Re: [PATCH v5 00/21] Free some vmemmap pages of hugetlb page
To: Michal Hocko
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org, mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com,
jroedel@suse.de, Mina Almasry, David Rientjes, Matthew Wilcox, Oscar Salvador, "Song Bao Hua (Barry Song)", Xiongchun duan, linux-doc@vger.kernel.org, LKML, Linux Memory Management List, linux-fsdevel
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 23, 2020 at 8:18 PM Michal Hocko wrote:
>
> On Mon 23-11-20 20:07:23, Muchun Song wrote:
> > On Mon, Nov 23, 2020 at 7:32 PM Michal Hocko wrote:
> [...]
> > > > > > > No I really mean that pfn_to_page will give you a struct page pointer
> > > > > > > from pages which you release from the vmemmap page tables. Those pages
> > > > > > > might get reused as soon as they are freed to the page allocator.
> > > > > >
> > > > > > We will remap vmemmap pages 2-7 (virtual addresses) to page
> > > > > > frame 1. And then we free page frames 2-7 to the buddy allocator.
> > > > >
> > > > > And this doesn't really happen in an atomic fashion from the pfn walker
> > > > > POV, right? So it is very well possible that
> > > >
> > > > Yeah, you are right. But it may not be a problem for HugeTLB pages.
> > > > Because in most cases, we only read the tail struct page and get the
> > > > head struct page through compound_head() when the pfn is within
> > > > a HugeTLB range. Right?
> > >
> > > Many pfn walkers would encounter the head page first and then skip over
> > > the rest. Those should be reasonably safe. But there is no guarantee and
> > > the fact that you need a valid page->compound_head which might get
> > > scribbled over once you have the struct page makes this extremely
> > > subtle.
> >
> > In this patch series, we can guarantee that page->compound_head
> > is always valid, because we reuse the first tail page. Maybe you need to
> > look closer at this series. Thanks.
>
> I must be really terrible at explaining my concern. Let me try one last
> time. It is really _irrelevant_ what you do with tail pages.
> The
> underlying problem is that you are changing struct pages under users
> without any synchronization. What used to be a valid struct page will
> turn into garbage as soon as you remap vmemmap page tables.

Thank you very much for your patient explanation. So if the pfn walkers
always try to get the head struct page through compound_head() when they
encounter a tail struct page, there will be no concerns. Do you agree?

> --
> Michal Hocko
> SUSE Labs

--
Yours,
Muchun