Date: Fri, 3 Jul 2020 12:59:44 +0200
From: Michal Hocko
To: Michal Suchánek
Cc: David Hildenbrand, Gautham R Shenoy, Srikar Dronamraju, Linus Torvalds, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Satheesh Rajendran, Mel Gorman, "Kirill A.
 Shutemov", Andrew Morton, linuxppc-dev@lists.ozlabs.org, Christopher Lameter, Vlastimil Babka, Andi Kleen
Subject: Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
Message-ID: <20200703105944.GS18446@dhcp22.suse.cz>
References: <20200624092846.9194-4-srikar@linux.vnet.ibm.com>
 <20200701084200.GN2369@dhcp22.suse.cz>
 <20200701100442.GB17918@linux.vnet.ibm.com>
 <184102af-ecf2-c834-db46-173ab2e66f51@redhat.com>
 <20200701110145.GC17918@linux.vnet.ibm.com>
 <0468f965-8762-76a3-93de-3987cf859927@redhat.com>
 <12945273-d788-710d-e8d7-974966529c7d@redhat.com>
 <20200701122110.GT2369@dhcp22.suse.cz>
 <20200703091001.GJ21462@kitsune.suse.cz>
 <20200703092414.GR18446@dhcp22.suse.cz>
In-Reply-To: <20200703092414.GR18446@dhcp22.suse.cz>

On Fri 03-07-20 11:24:17, Michal Hocko wrote:
> [Cc Andi]
>
> On Fri 03-07-20 11:10:01, Michal Suchanek wrote:
> > On Wed, Jul 01, 2020 at 02:21:10PM +0200, Michal Hocko wrote:
> > > On Wed 01-07-20 13:30:57, David Hildenbrand wrote:
> [...]
> > > > Yep, looks like it.
> > > >
> > > > [ 0.009726] SRAT: PXM 1 -> APIC 0x00 -> Node 0
> > > > [ 0.009727] SRAT: PXM 1 -> APIC 0x01 -> Node 0
> > > > [ 0.009727] SRAT: PXM 1 -> APIC 0x02 -> Node 0
> > > > [ 0.009728] SRAT: PXM 1 -> APIC 0x03 -> Node 0
> > > > [ 0.009731] ACPI: SRAT: Node 0 PXM 1 [mem 0x00000000-0x0009ffff]
> > > > [ 0.009732] ACPI: SRAT: Node 0 PXM 1 [mem 0x00100000-0xbfffffff]
> > > > [ 0.009733] ACPI: SRAT: Node 0 PXM 1 [mem 0x100000000-0x13fffffff]
> > >
> > > This begs a question whether ppc can do the same thing?
> > Or x86 stop doing it so that you can see on what node you are running?
> >
> > What's the point of this indirection other than another way of avoiding
> > empty node 0?
>
> Honestly, I do not have any idea. I've traced it down to
> Author: Andi Kleen
> Date:   Tue Jan 11 15:35:48 2005 -0800
>
>     [PATCH] x86_64: Fix ACPI SRAT NUMA parsing
>
>     Fix fallout from the recent nodemask_t changes. The node ids assigned
>     in the SRAT parser were off by one.
>
>     I added a new first_unset_node() function to nodemask.h to allocate
>     IDs sanely.
>
>     Signed-off-by: Andi Kleen
>     Signed-off-by: Linus Torvalds
>
> which doesn't really tell all that much. The historical baggage and a
> long term behavior which is not really trivial to fix I suspect.

Thinking about this some more, this logic makes some sense after all,
especially in a world without memory hotplug, which was very likely the
case back then. It is much better to have a compact node mask rather
than a sparse one. After all, node numbers shouldn't really matter as
long as you have a clear mapping to the HW. I am not sure we export that
information (except for the kernel ring buffer) though.

Memory hotplug changes that somewhat, because you can hot-remove NUMA
nodes and therefore make the nodemask sparse, but that is not a common
case. I am not sure what would happen if a completely new node was added
and its corresponding node number was already used by a renumbered one
though. It would likely conflate the two, I am afraid. But I am not sure
this is really possible with x86, and the lack of a bug report would
suggest that nobody is doing that, at least.
-- 
Michal Hocko
SUSE Labs
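
For illustration only, here is a minimal userspace sketch of the compact
renumbering discussed above: each firmware proximity domain (PXM) gets
the lowest logical node id that is still free. This is not the kernel's
SRAT parser. MAX_NODES, MAX_PXM, NO_NODE and every name ending in
_sketch are made up for this example, and a plain bitmask stands in for
nodemask_t.

```c
#include <stdint.h>

#define MAX_NODES 8     /* hypothetical limit, chosen for this sketch */
#define MAX_PXM   16    /* hypothetical limit, chosen for this sketch */
#define NO_NODE   (-1)

/* Bit n set means logical node id n has already been handed out. */
static uint32_t nodes_found;

/* pxm_to_node_plus1[pxm] holds node id + 1; 0 means "not mapped yet". */
static int pxm_to_node_plus1[MAX_PXM];

/* Lowest node id not yet set in mask, or NO_NODE if all are taken. */
static int first_unset_node_sketch(uint32_t mask)
{
	for (int n = 0; n < MAX_NODES; n++)
		if (!(mask & (UINT32_C(1) << n)))
			return n;
	return NO_NODE;
}

/* Map a firmware proximity domain (PXM) to a compact logical node id. */
static int pxm_to_node_sketch(int pxm)
{
	int node;

	if (pxm < 0 || pxm >= MAX_PXM)
		return NO_NODE;

	/* Reuse the id if this PXM has been seen before. */
	if (pxm_to_node_plus1[pxm])
		return pxm_to_node_plus1[pxm] - 1;

	node = first_unset_node_sketch(nodes_found);
	if (node == NO_NODE)
		return NO_NODE;

	nodes_found |= UINT32_C(1) << node;
	pxm_to_node_plus1[pxm] = node + 1;
	return node;
}
```

On a fresh table, pxm_to_node_sketch(1) returns 0, which has the same
shape as the "PXM 1 -> Node 0" lines in the quoted log. A later
pxm_to_node_sketch(0) would return 1, so the logical id no longer tells
you the firmware id without consulting the table.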