LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dalias@libc.org, linux-sh@vger.kernel.org,
	catalin.marinas@arm.com, dave.hansen@linux.intel.com,
	heiko.carstens@de.ibm.com, jiaxun.yang@flygoat.com,
	linux-mips@vger.kernel.org, mwb@linux.vnet.ibm.com,
	paulus@samba.org, hpa@zytor.com, sparclinux@vger.kernel.org,
	chenhc@lemote.com, will@kernel.org, cai@lca.pw,
	linux-s390@vger.kernel.org, ysato@users.sourceforge.jp,
	x86@kernel.org, Yunsheng Lin <linyunsheng@huawei.com>,
	rppt@linux.ibm.com, borntraeger@de.ibm.com, dledford@redhat.com,
	mingo@redhat.com, jeffrey.t.kirsher@intel.com, jhogan@kernel.org,
	mattst88@gmail.com, len.brown@intel.com, gor@linux.ibm.com,
	anshuman.khandual@arm.com, gregkh@linuxfoundation.org,
	bp@alien8.de, luto@kernel.org, tglx@linutronix.de,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, rth@twiddle.net,
	axboe@kernel.dk, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	tbogendoerfer@suse.de, paul.burton@mips.com,
	linux-alpha@vger.kernel.org, rafael@kernel.org,
	ink@jurassic.park.msu.ru, akpm@linux-foundation.org,
	robin.murphy@arm.com, davem@davemloft.net
Subject: Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware
Date: Mon, 23 Sep 2019 18:52:35 +0200
Message-ID: <20190923165235.GD17206@dhcp22.suse.cz> (raw)
In-Reply-To: <20190923154852.GG2369@hirez.programming.kicks-ass.net>

On Mon 23-09-19 17:48:52, Peter Zijlstra wrote:
> On Mon, Sep 23, 2019 at 05:28:56PM +0200, Michal Hocko wrote:
> > On Mon 23-09-19 17:15:19, Peter Zijlstra wrote:
> 
> > > > diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> > > > index 4123100e..9859acb 100644
> > > > --- a/arch/x86/mm/numa.c
> > > > +++ b/arch/x86/mm/numa.c
> > > > @@ -861,6 +861,9 @@ void numa_remove_cpu(int cpu)
> > > >   */
> > > >  const struct cpumask *cpumask_of_node(int node)
> > > >  {
> > > > +	if (node == NUMA_NO_NODE)
> > > > +		return cpu_online_mask;
> > > 
> > > This mandates the caller holds cpus_read_lock() or something, I'm pretty
> > > sure that if I put:
> > > 
> > > 	lockdep_assert_cpus_held();
> > 
> > Is this documented somewhere?
> 
> No idea... common sense :-)

I thought that and cpuhotplug were forbiden to be used in the same
sentence :p

> > Also how does that differ from a normal
> > case when a proper node is used? The cpumask will always be dynamic in
> > the cpu hotplug presence, right?
> 
> As per normal yes, and I'm fairly sure there's a ton of bugs. Any
> 'online' state is subject to change except when you're holding
> sufficient locks to stop it.
> 
> Disabling preemption also stabilizes it, because cpu unplug relies on
> stop-machine.

OK, I guess it is fair to document that callers should be careful when
using this if they absolutely need any stability. But I strongly suspect
they simply do not care all that much. They mostly do care to have
something that gives them an idea which CPUs are close to the device and
that can tolerate some race.

In other words this is more of an optimization than a correctness issue.
 
> > > here, it comes apart real quick. Without holding the cpu hotplug lock,
> > > the online mask is gibberish.
> > 
> > Can the returned cpu mask go away?
> 
> No, the cpu_online_mask itself has static storage, the contents OTOH can
> change at will. Very little practical difference :-)
 
OK, thanks for the confirmation. I was worried that I've overlooked
something.

To the NUMA_NO_NODE itself. Your earlier email noted:
: > +
: >  	if ((unsigned)node >= nr_node_ids) {
: >  		printk(KERN_WARNING
: >  			"cpumask_of_node(%d): (unsigned)node >= nr_node_ids(%u)\n",
: 
: I still think this makes absolutely no sense what so ever.

Did you mean the NUMA_NO_NODE handling or the specific node >= nr_node_ids
check?

Because as to NUMA_NO_NODE I believe this makes sense because this is
the only way that a device is not bound to any numa node. I even the
ACPI standard is considering this optional. Yunsheng Lin has referred to
the specific part of the standard in one of the earlier discussions.
Trying to guess the node affinity is worse than providing all CPUs IMHO.
-- 
Michal Hocko
SUSE Labs

  reply index

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-17 12:48 Yunsheng Lin
2019-09-21 22:38 ` Paul Burton
2019-09-23  2:31   ` Yunsheng Lin
2019-09-23 15:15 ` Peter Zijlstra
2019-09-23 15:28   ` Michal Hocko
2019-09-23 15:48     ` Peter Zijlstra
2019-09-23 16:52       ` Michal Hocko [this message]
2019-09-23 20:34         ` Peter Zijlstra
2019-09-24  1:29           ` Yunsheng Lin
2019-09-24  9:25             ` Peter Zijlstra
2019-09-24 11:07               ` Yunsheng Lin
2019-09-24 11:28                 ` Peter Zijlstra
2019-09-24 11:44                   ` Yunsheng Lin
2019-09-24 11:58                     ` Peter Zijlstra
2019-09-24 12:09                       ` Yunsheng Lin
2019-09-24  7:47           ` Michal Hocko
2019-09-24  9:17             ` Peter Zijlstra
2019-09-24 10:56               ` Michal Hocko
2019-09-24 11:23                 ` Peter Zijlstra
2019-09-24 11:54                   ` Michal Hocko
2019-09-24 12:09                     ` Peter Zijlstra
2019-09-24 12:25                       ` Michal Hocko
2019-09-24 12:43                         ` Peter Zijlstra
2019-09-24 12:59                           ` Peter Zijlstra
2019-09-24 13:19                             ` Michal Hocko
2019-09-25  9:14                               ` Yunsheng Lin
2019-09-25 10:41                                 ` Peter Zijlstra
2019-10-08  8:38                                   ` Yunsheng Lin
2019-10-09 12:25                                     ` Robin Murphy
2019-10-10  6:07                                       ` Yunsheng Lin
2019-10-10  7:32                                         ` Michal Hocko
2019-10-11  3:27                                           ` Yunsheng Lin
2019-10-11 11:15                                             ` Peter Zijlstra
2019-10-12  6:17                                               ` Yunsheng Lin
2019-10-12  7:40                                                 ` Greg KH
2019-10-12  9:47                                                   ` Yunsheng Lin
2019-10-12 10:40                                                     ` Greg KH
2019-10-12 10:47                                                       ` Greg KH
2019-10-14  8:00                                                         ` Yunsheng Lin
2019-10-14  9:25                                                           ` Greg KH
2019-10-14  9:49                                                             ` Peter Zijlstra
2019-10-14 10:04                                                               ` Greg KH
2019-10-15 10:40                                                             ` Yunsheng Lin
2019-10-15 16:58                                                               ` Greg KH
2019-10-16 12:07                                                                 ` Yunsheng Lin
2019-10-28  9:20                                                   ` Yunsheng Lin
2019-10-29  8:53                                                     ` Michal Hocko
2019-10-30  1:58                                                       ` Yunsheng Lin
2019-10-10  8:56                                       ` Peter Zijlstra
2019-09-25 10:40                               ` Peter Zijlstra
2019-09-25 13:25                                 ` Michal Hocko
2019-09-25 16:31                                   ` Peter Zijlstra
2019-09-25 21:45                                     ` Peter Zijlstra
2019-09-26  9:05                                       ` Peter Zijlstra
2019-09-26 12:10                                         ` Peter Zijlstra
2019-09-26 11:45                                     ` Geert Uytterhoeven
2019-09-26 12:24                                       ` Peter Zijlstra

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190923165235.GD17206@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=axboe@kernel.dk \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@alien8.de \
    --cc=cai@lca.pw \
    --cc=catalin.marinas@arm.com \
    --cc=chenhc@lemote.com \
    --cc=dalias@libc.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=dledford@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hpa@zytor.com \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jeffrey.t.kirsher@intel.com \
    --cc=jhogan@kernel.org \
    --cc=jiaxun.yang@flygoat.com \
    --cc=len.brown@intel.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=linyunsheng@huawei.com \
    --cc=luto@kernel.org \
    --cc=mattst88@gmail.com \
    --cc=mingo@redhat.com \
    --cc=mwb@linux.vnet.ibm.com \
    --cc=naveen.n.rao@linux.vnet.ibm.com \
    --cc=paul.burton@mips.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=ralf@linux-mips.org \
    --cc=robin.murphy@arm.com \
    --cc=rppt@linux.ibm.com \
    --cc=rth@twiddle.net \
    --cc=sparclinux@vger.kernel.org \
    --cc=tbogendoerfer@suse.de \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git