In-Reply-To: <20201123121842.GM27488@dhcp22.suse.cz>
From: Muchun Song
Date: Mon, 23 Nov 2020 20:40:40 +0800
Subject: Re: [External] Re: [PATCH v5 00/21] Free some vmemmap pages of hugetlb page
To: Michal Hocko
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org, mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com,
    jroedel@suse.de, Mina Almasry, David Rientjes, Matthew Wilcox, Oscar Salvador, "Song Bao Hua (Barry Song)", Xiongchun duan, linux-doc@vger.kernel.org, LKML, Linux Memory Management List, linux-fsdevel

On Mon, Nov 23, 2020 at 8:18 PM Michal Hocko wrote:
>
> On Mon 23-11-20 20:07:23, Muchun Song wrote:
> > On Mon, Nov 23, 2020 at 7:32 PM Michal Hocko wrote:
> [...]
> > > > > > > No I really mean that pfn_to_page will give you a struct page pointer
> > > > > > > from pages which you release from the vmemmap page tables. Those pages
> > > > > > > might get reused as soon as they are freed to the page allocator.
> > > > > >
> > > > > > We will remap vmemmap pages 2-7 (virtual addresses) to page
> > > > > > frame 1. And then we free page frames 2-7 to the buddy allocator.
> > > > >
> > > > > And this doesn't really happen in an atomic fashion from the pfn walker
> > > > > POV, right? So it is very well possible that
> > > >
> > > > Yeah, you are right. But it may not be a problem for HugeTLB pages.
> > > > Because in most cases, we only read the tail struct page and get the
> > > > head struct page through compound_head() when the pfn is within
> > > > a HugeTLB range. Right?
> > >
> > > Many pfn walkers would encounter the head page first and then skip over
> > > the rest. Those should be reasonably safe. But there is no guarantee and
> > > the fact that you need a valid page->compound_head which might get
> > > scribbled over once you have the struct page makes this extremely
> > > subtle.
> >
> > In this patch series, we can guarantee that the page->compound_head
> > is always valid. Because we reuse the first tail page. Maybe you need to
> > look closer at this series. Thanks.
>
> I must be really terrible at explaining my concern. Let me try one last
> time. It is really _irrelevant_ what you do with tail pages.
> The
> underlying problem is that you are changing struct pages under users
> without any synchronization. What used to be a valid struct page will
> turn into garbage as soon as you remap vmemmap page tables.

Thank you very much for your patient explanation. So if the pfn walkers
always try to get the head struct page through compound_head() when they
encounter a tail struct page, there will be no concern. Do you agree?

> --
> Michal Hocko
> SUSE Labs

--
Yours,
Muchun