From: Srikar Dronamraju <firstname.lastname@example.org>
To: Michal Hocko <email@example.com>
Cc: David Hildenbrand <firstname.lastname@example.org>,
	Andrew Morton <email@example.com>,
	firstname.lastname@example.org, email@example.com,
	firstname.lastname@example.org, Mel Gorman <email@example.com>,
	Vlastimil Babka <firstname.lastname@example.org>,
	"Kirill A. Shutemov" <email@example.com>,
	Christopher Lameter <firstname.lastname@example.org>,
	Michael Ellerman <email@example.com>,
	Linus Torvalds <firstname.lastname@example.org>,
	Gautham R Shenoy <email@example.com>,
	Satheesh Rajendran <firstname.lastname@example.org>
Subject: Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
Date: Thu, 2 Jul 2020 20:02:27 +0530
Message-ID: <20200702143227.GE17918@linux.vnet.ibm.com>
In-Reply-To: <20200702084123.GC18446@dhcp22.suse.cz>

* Michal Hocko <email@example.com> [2020-07-02 10:41:23]:

> On Thu 02-07-20 12:14:08, Srikar Dronamraju wrote:
> > * Michal Hocko <firstname.lastname@example.org> [2020-07-01 14:21:10]:
> >
> > > >>>>> The autonuma problem sounds interesting but again this patch doesn't
> > > >>>>> really solve the underlying problem because I strongly suspect that the
> > > >>>>> problem is still there when a numa node gets all its memory offline as
> > > >>>>> mentioned above.
> > >
> > > I would really appreciate a feedback to these two as well.
> >
> > 1. Its not just numactl that's to be fixed but all tools/utilities that
> > depend on /sys/devices/system/node/online. Are we saying to not rely/believe
> > in the output given by the kernel but do further verification?
>
> No, what we are saying is that even an online node might have zero
> number of online pages/cpus. So the online status is not really
> something that matters. If people are confused by that output then user
> space tools can make their confusion go away. I really do not understand
> why the kernel should do any logic there.

The user-facing teams say they are getting queries from users who cannot
tell from the tools/sysfs files why a node is online but has no attached
resources. It is the amount of time being spent on these issues that
triggered the patch. Initially even I thought this was a non-issue.

>
> > Also how would the user space differentiate between the case where the
> > Kernel missed marking a node as offline to the case where the memory was
> > offlined on a cpuless node but node wasn't offline?.
>
> What I am arguing is that those two shouldn't be any different. Really!
>
> > 2. Regarding the autonuma, the case of offline memory is user/admin driven,
> > so if there is a performance hit, its something that's driven by his
> > user/admin actions. Also how often do we see users offline complete memory
> > of cpuless node on a 2 node system?
>
> How often do we see crippled HW configurations like that? Really if
> autonuma should be made more clever for one case it should recognize the
> other as well.
>

Let's take a 16-socket PowerVM system and assume that 32 lpars are created
on that system, i.e. 2 lpars for each socket. (PowerVM has the final say on
how the lpars are created.) In such a case, we can expect 30 out of the 32
lpars to face this problem, with only the 2 lpars that actually run on
socket 0 having the correct configuration.

> > >
> > > This begs a question whether ppc can do the same thing?
> >
> > Certainly ppc can be made to adapt to this situation but that would be a
> > workaround. Do we have a reason why we think node 0 is unique and special?
>
> It is not. As replied in other email in this thread. I would hope for
> having less hacks in the numa initialization. Cleaning up the mess is
> would be a lot of work and testing on all NUMA capable architectures.
> This is a heritage from the past I am afraid.
> All that I am arguing here
> is that your touch to the generic code with a very simple looking patch
> might have side effects which are pretty much impossible to review.
> Moreover it seems that nothing but ppc really needs this treatment.
> So fixing it in ppc specific code sounds much more safe.
>
> Normally I would really push for a generic solution but after getting
> burned several times in this area I do not dare anymore. The problem is
> not in the code complexity but in how spread it is in places where you
> do not expect side effects.
>

I do understand and respect your viewpoint.

> --
> Michal Hocko
> SUSE Labs

--
Thanks and Regards
Srikar Dronamraju