Date: Thu, 25 Mar 2021 13:26:34 +0100
From: Michal Hocko
To: Oscar Salvador
Cc: David Hildenbrand, Andrew Morton, Anshuman Khandual, Vlastimil Babka, Pavel Tatashin, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 1/5] mm,memory_hotplug: Allocate memmap
 from the added memory range
References: <20210324101259.GB16560@linux> <3bc4168c-fd31-0c9a-44ac-88e25d524eef@redhat.com> <9591a0b8-c000-2f61-67a6-4402678fe50b@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 25-03-21 11:55:01, Oscar Salvador wrote:
> On Thu, Mar 25, 2021 at 10:17:33AM +0100, Michal Hocko wrote:
> > Why do you think it is wrong to initialize/account pages when they are
> > used? Keep in mind that offline pages are not used until they are
> > onlined. But vmemmap pages are used since the vmemmap is established
> > which happens in the hotadd stage.
>
> Yes, that is true.
> vmemmap pages are used right when we populate the vmemmap space.
>
> > > plus the fact that I dislike to place those pages in
> > > ZONE_NORMAL, although they are not movable.
> > > But I think the vmemmap pages should lie within the same zone as the
> > > pages they describe, doing so simplifies things, and I do not see any
> > > outright downside.
> >
> > Well, both ways likely have their pros and cons. Nevertheless, if the
> > vmemmap storage is independent (which is the case for normal hotplug)
> > then the state is consistent over hotadd, {online, offline} N times,
> > hotremove cycles. Which is conceptually reasonable as vmemmap doesn't
> > go away on each offline.
> >
> > If you are going to bind accounting to the online/offline stages then
> > the accounting changes each time you go through the cycle and depending
> > on the onlining type it would travel among zones. I find it quite
> > confusing as the storage for vmemmap hasn't changed any of its
> > properties.
>
> That is a good point I guess.
> vmemmap pages do not really go away until the memory is unplugged.
>
> But I see some questions to raise:
>
> - As I said, I really dislike tying vmemmap memory to ZONE_NORMAL
>   unconditionally and this might result in the problems David mentioned.
>   I remember David and I discussed such problems but the problems with
>   zones not being contiguous have also been discussed in the past and
>   IIRC, we reached the conclusion that a maximal effort should be made
>   to keep them that way, otherwise other things suffer, e.g. compaction
>   code.

Yeah, David has raised the contiguous flag for zone already. And to be
completely honest I fail to see why we should shape a design based on an
optimization. If anything we can teach set_zone_contiguous to simply
ignore zone affiliation of vmemmap pages. I would be really curious if
that would pose any harm to the compaction code as they are reserved and
compaction should simply skip them.

>   So if we really want to move the initialization/accounting to the
>   hot-add/hot-remove stage, I would really like to be able to set the
>   proper zone in there (that is, the same zone where the memory will lie).

There is nothing like a proper zone.

> - When moving the initialization/accounting to hot-add/hot-remove,
>   the section containing the vmemmap pages will remain offline.

Yes this sucks! I do not have a good answer for that as the
online/offline granularity seems rather coarse on that.

>   It might get onlined once the pages get online in online_pages(),
>   or not if vmemmap pages span a whole section.
>   I remember (but maybe David remembers better) that that was a problem
>   wrt. pfn_to_online_page() and hibernation/kdump.
>   So, if that is really a problem, we would have to take care of setting
>   the section to the right state.
>
> - AFAICS, doing all the above brings us to former times where some
>   initialization/accounting was done in a previous stage, and I remember
>   it was pushed hard to move those in online/offline_pages().

Not sure what you are referring to but if you mean prior to
f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to
zones until online") then this was entirely a different story. Users do
care where their memory goes because that depends on the use case but do
they care about vmemmap?
-- 
Michal Hocko
SUSE Labs