All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dalias@libc.org, linux-sh@vger.kernel.org,
	catalin.marinas@arm.com, dave.hansen@linux.intel.com,
	heiko.carstens@de.ibm.com, linuxarm@huawei.com,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	mwb@linux.vnet.ibm.com, paulus@samba.org, hpa@zytor.com,
	sparclinux@vger.kernel.org, chenhc@lemote.com, will@kernel.org,
	bp@alien8.de, linux-s390@vger.kernel.org,
	ysato@users.sourceforge.jp, mpe@ellerman.id.au, x86@kernel.org,
	rppt@linux.ibm.com, borntraeger@de.ibm.com, dledford@redhat.com,
	mingo@redhat.com, jeffrey.t.kirsher@intel.com,
	benh@kernel.crashing.org, jhogan@kernel.org,
	nfont@linux.vnet.ibm.com, mattst88@gmail.com,
	len.brown@intel.com, gor@linux.ibm.com,
	anshuman.khandual@arm.com, ink@jurassic.park.msu.ru,
	luto@kernel.org, tglx@linutronix.de,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, rth@twiddle.net,
	axboe@kernel.dk, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	tbogendoerfer@suse.de, paul.burton@mips.com,
	linux-alpha@vger.kernel.org, cai@lca.pw,
	akpm@linux-foundation.org, robin.murphy@arm.com,
	davem@davemloft.net
Subject: Re: [PATCH v2 2/9] x86: numa: check the node id consistently for x86
Date: Mon, 02 Sep 2019 12:25:24 +0000	[thread overview]
Message-ID: <5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com> (raw)
In-Reply-To: <20190902072542.GN2369@hirez.programming.kicks-ass.net>

On 2019/9/2 15:25, Peter Zijlstra wrote:
> On Mon, Sep 02, 2019 at 01:46:51PM +0800, Yunsheng Lin wrote:
>> On 2019/9/1 0:12, Peter Zijlstra wrote:
> 
>>> 1) because even it is not set, the device really does belong to a node.
>>> It is impossible a device will have magic uniform access to memory when
>>> CPUs cannot.
>>
>> So it means dev_to_node() will return either NUMA_NO_NODE or a
>> valid node id?
> 
> NUMA_NO_NODE := -1, which is not a valid node number. It is also, like I
> said, not a valid device location on a NUMA system.
> 
> Just because ACPI/BIOS is shit, doesn't mean the device doesn't have a
> node association. It just means we don't know and might have to guess.

How do we guess the device's location when ACPI/BIOS does not set it?

It seems dev_to_node() does not do anything about that and leave the
job to the caller or whatever function that get called with its return
value, such as cpumask_of_node().

> 
>>> 2) is already true today, cpumask_of_node() requires a valid node_id.
>>
>> Ok, most of the user does check node_id before calling
>> cpumask_of_node(), but does a little different type of checking:
>>
>> 1) some does " < 0" check;
>> 2) some does "= NUMA_NO_NODE" check;
>> 3) some does ">= MAX_NUMNODES" check;
>> 4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.
> 
> The one true way is:
> 
> 	'(unsigned)node_id >= nr_node_ids'

I missed the magic of the "unsigned" in your previous reply.

> 
>>> 3) is just wrong and increases overhead for everyone.
>>
>> Ok, cpumask_of_node() is also used in some critical path such
>> as scheduling, which may not need those checking, the overhead
>> is unnecessary.
>>
>> But for non-critical path such as setup or configuration path,
>> it better to have consistent checking, and also simplify the
>> user code that calls cpumask_of_node().
>>
>> Do you think it is worth the trouble to add a new function
>> such as cpumask_of_node_check(maybe some other name) to do
>> consistent checking?
>>
>> Or caller just simply check if dev_to_node()'s return value is
>> NUMA_NO_NODE before calling cpumask_of_node()?
> 
> It is not a matter of convenience. The function is called
> cpumask_of_node(), when node < 0 || node >= nr_node_ids, it is not a
> valid node, therefore the function shouldn't return anything except an
> error.
what do you mean by error? What I can think is three type of errors:
1) return NULL, this way it seems cpumask_of_node() also leave the
   job to the function that calls it.
2) cpu_none_mask, I am not sure what this means, maybe it means there
   is no cpu on the same node with the device?
3) give a warning, stack dump, or even a BUG_ON?

I would prefer the second one, and implement the third one when the
CONFIG_DEBUG_PER_CPU_MAPS is selected.

Any suggestion?

> 
> Also note that the CONFIG_DEBUG_PER_CPU_MAPS version of
> cpumask_of_node() already does this (although it wants the below fix).

Thanks for the note and example.

WARNING: multiple messages have this Message-ID (diff)
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: <dalias@libc.org>, <linux-sh@vger.kernel.org>,
	<catalin.marinas@arm.com>, <dave.hansen@linux.intel.com>,
	<heiko.carstens@de.ibm.com>, <linuxarm@huawei.com>,
	<jiaxun.yang@flygoat.com>, <linux-kernel@vger.kernel.org>,
	<mwb@linux.vnet.ibm.com>, <paulus@samba.org>, <hpa@zytor.com>,
	<sparclinux@vger.kernel.org>, <chenhc@lemote.com>,
	<will@kernel.org>, <linux-s390@vger.kernel.org>,
	<ysato@users.sourceforge.jp>, <mpe@ellerman.id.au>,
	<x86@kernel.org>, <rppt@linux.ibm.com>, <borntraeger@de.ibm.com>,
	<dledford@redhat.com>, <mingo@redhat.com>,
	<jeffrey.t.kirsher@intel.com>, <benh@kernel.crashing.org>,
	<jhogan@kernel.org>, <nfont@linux.vnet.ibm.com>,
	<mattst88@gmail.com>, <len.brown@intel.com>, <gor@linux.ibm.com>,
	<anshuman.khandual@arm.com>, <ink@jurassic.park.msu.ru>,
	<cai@lca.pw>, <luto@kernel.org>, <tglx@linutronix.de>,
	<naveen.n.rao@linux.vnet.ibm.com>,
	<linux-arm-kernel@lists.infradead.org>, <rth@twiddle.net>,
	<axboe@kernel.dk>, <robin.murphy@arm.com>,
	<linux-mips@vger.kernel.org>, <ralf@linux-mips.org>,
	<tbogendoerfer@suse.de>, <paul.burton@mips.com>,
	<linux-alpha@vger.kernel.org>, <bp@alien8.de>,
	<akpm@linux-foundation.org>, <linuxppc-dev@lists.ozlabs.org>,
	<davem@davemloft.net>
Subject: Re: [PATCH v2 2/9] x86: numa: check the node id consistently for x86
Date: Mon, 2 Sep 2019 20:25:24 +0800	[thread overview]
Message-ID: <5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com> (raw)
In-Reply-To: <20190902072542.GN2369@hirez.programming.kicks-ass.net>

On 2019/9/2 15:25, Peter Zijlstra wrote:
> On Mon, Sep 02, 2019 at 01:46:51PM +0800, Yunsheng Lin wrote:
>> On 2019/9/1 0:12, Peter Zijlstra wrote:
> 
>>> 1) because even it is not set, the device really does belong to a node.
>>> It is impossible a device will have magic uniform access to memory when
>>> CPUs cannot.
>>
>> So it means dev_to_node() will return either NUMA_NO_NODE or a
>> valid node id?
> 
> NUMA_NO_NODE := -1, which is not a valid node number. It is also, like I
> said, not a valid device location on a NUMA system.
> 
> Just because ACPI/BIOS is shit, doesn't mean the device doesn't have a
> node association. It just means we don't know and might have to guess.

How do we guess the device's location when ACPI/BIOS does not set it?

It seems dev_to_node() does not do anything about that and leave the
job to the caller or whatever function that get called with its return
value, such as cpumask_of_node().

> 
>>> 2) is already true today, cpumask_of_node() requires a valid node_id.
>>
>> Ok, most of the user does check node_id before calling
>> cpumask_of_node(), but does a little different type of checking:
>>
>> 1) some does " < 0" check;
>> 2) some does "== NUMA_NO_NODE" check;
>> 3) some does ">= MAX_NUMNODES" check;
>> 4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.
> 
> The one true way is:
> 
> 	'(unsigned)node_id >= nr_node_ids'

I missed the magic of the "unsigned" in your previous reply.

> 
>>> 3) is just wrong and increases overhead for everyone.
>>
>> Ok, cpumask_of_node() is also used in some critical path such
>> as scheduling, which may not need those checking, the overhead
>> is unnecessary.
>>
>> But for non-critical path such as setup or configuration path,
>> it better to have consistent checking, and also simplify the
>> user code that calls cpumask_of_node().
>>
>> Do you think it is worth the trouble to add a new function
>> such as cpumask_of_node_check(maybe some other name) to do
>> consistent checking?
>>
>> Or caller just simply check if dev_to_node()'s return value is
>> NUMA_NO_NODE before calling cpumask_of_node()?
> 
> It is not a matter of convenience. The function is called
> cpumask_of_node(), when node < 0 || node >= nr_node_ids, it is not a
> valid node, therefore the function shouldn't return anything except an
> error.
what do you mean by error? What I can think is three type of errors:
1) return NULL, this way it seems cpumask_of_node() also leave the
   job to the function that calls it.
2) cpu_none_mask, I am not sure what this means, maybe it means there
   is no cpu on the same node with the device?
3) give a warning, stack dump, or even a BUG_ON?

I would prefer the second one, and implement the third one when the
CONFIG_DEBUG_PER_CPU_MAPS is selected.

Any suggestion?

> 
> Also note that the CONFIG_DEBUG_PER_CPU_MAPS version of
> cpumask_of_node() already does this (although it wants the below fix).

Thanks for the note and example.


WARNING: multiple messages have this Message-ID (diff)
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dalias@libc.org, linux-sh@vger.kernel.org,
	catalin.marinas@arm.com, dave.hansen@linux.intel.com,
	heiko.carstens@de.ibm.com, linuxarm@huawei.com,
	jiaxun.yang@flygoat.com, linux-kernel@vger.kernel.org,
	mwb@linux.vnet.ibm.com, paulus@samba.org, hpa@zytor.com,
	sparclinux@vger.kernel.org, chenhc@lemote.com, will@kernel.org,
	linux-s390@vger.kernel.org, ysato@users.sourceforge.jp,
	mpe@ellerman.id.au, x86@kernel.org, rppt@linux.ibm.com,
	borntraeger@de.ibm.com, dledford@redhat.com, mingo@redhat.com,
	jeffrey.t.kirsher@intel.com, benh@kernel.crashing.org,
	jhogan@kernel.org, nfont@linux.vnet.ibm.com, mattst88@gmail.com,
	len.brown@intel.com, gor@linux.ibm.com,
	anshuman.khandual@arm.com, ink@jurassic.park.msu.ru, cai@lca.pw,
	luto@kernel.org, tglx@linutronix.de,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, rth@twiddle.net,
	axboe@kernel.dk, robin.murphy@arm.com,
	linux-mips@vger.kernel.org, ralf@linux-mips.org,
	tbogendoerfer@suse.de, paul.burton@mips.com,
	linux-alpha@vger.kernel.org, bp@alien8.de,
	akpm@linux-foundation.org, linuxppc-dev@lists.ozlabs.org,
	davem@davemloft.net
Subject: Re: [PATCH v2 2/9] x86: numa: check the node id consistently for x86
Date: Mon, 2 Sep 2019 20:25:24 +0800	[thread overview]
Message-ID: <5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com> (raw)
In-Reply-To: <20190902072542.GN2369@hirez.programming.kicks-ass.net>

On 2019/9/2 15:25, Peter Zijlstra wrote:
> On Mon, Sep 02, 2019 at 01:46:51PM +0800, Yunsheng Lin wrote:
>> On 2019/9/1 0:12, Peter Zijlstra wrote:
> 
>>> 1) because even it is not set, the device really does belong to a node.
>>> It is impossible a device will have magic uniform access to memory when
>>> CPUs cannot.
>>
>> So it means dev_to_node() will return either NUMA_NO_NODE or a
>> valid node id?
> 
> NUMA_NO_NODE := -1, which is not a valid node number. It is also, like I
> said, not a valid device location on a NUMA system.
> 
> Just because ACPI/BIOS is shit, doesn't mean the device doesn't have a
> node association. It just means we don't know and might have to guess.

How do we guess the device's location when ACPI/BIOS does not set it?

It seems dev_to_node() does not do anything about that and leave the
job to the caller or whatever function that get called with its return
value, such as cpumask_of_node().

> 
>>> 2) is already true today, cpumask_of_node() requires a valid node_id.
>>
>> Ok, most of the user does check node_id before calling
>> cpumask_of_node(), but does a little different type of checking:
>>
>> 1) some does " < 0" check;
>> 2) some does "== NUMA_NO_NODE" check;
>> 3) some does ">= MAX_NUMNODES" check;
>> 4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.
> 
> The one true way is:
> 
> 	'(unsigned)node_id >= nr_node_ids'

I missed the magic of the "unsigned" in your previous reply.

> 
>>> 3) is just wrong and increases overhead for everyone.
>>
>> Ok, cpumask_of_node() is also used in some critical path such
>> as scheduling, which may not need those checking, the overhead
>> is unnecessary.
>>
>> But for non-critical path such as setup or configuration path,
>> it better to have consistent checking, and also simplify the
>> user code that calls cpumask_of_node().
>>
>> Do you think it is worth the trouble to add a new function
>> such as cpumask_of_node_check(maybe some other name) to do
>> consistent checking?
>>
>> Or caller just simply check if dev_to_node()'s return value is
>> NUMA_NO_NODE before calling cpumask_of_node()?
> 
> It is not a matter of convenience. The function is called
> cpumask_of_node(), when node < 0 || node >= nr_node_ids, it is not a
> valid node, therefore the function shouldn't return anything except an
> error.
what do you mean by error? What I can think is three type of errors:
1) return NULL, this way it seems cpumask_of_node() also leave the
   job to the function that calls it.
2) cpu_none_mask, I am not sure what this means, maybe it means there
   is no cpu on the same node with the device?
3) give a warning, stack dump, or even a BUG_ON?

I would prefer the second one, and implement the third one when the
CONFIG_DEBUG_PER_CPU_MAPS is selected.

Any suggestion?

> 
> Also note that the CONFIG_DEBUG_PER_CPU_MAPS version of
> cpumask_of_node() already does this (although it wants the below fix).

Thanks for the note and example.

WARNING: multiple messages have this Message-ID (diff)
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dalias@libc.org, linux-sh@vger.kernel.org,
	catalin.marinas@arm.com, dave.hansen@linux.intel.com,
	heiko.carstens@de.ibm.com, linuxarm@huawei.com,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	mwb@linux.vnet.ibm.com, paulus@samba.org, hpa@zytor.com,
	sparclinux@vger.kernel.org, chenhc@lemote.com, will@kernel.org,
	bp@alien8.de, linux-s390@vger.kernel.org,
	ysato@users.sourceforge.jp, x86@kernel.org, rppt@linux.ibm.com,
	borntraeger@de.ibm.com, dledford@redhat.com, mingo@redhat.com,
	jeffrey.t.kirsher@intel.com, jhogan@kernel.org,
	nfont@linux.vnet.ibm.com, mattst88@gmail.com,
	len.brown@intel.com, gor@linux.ibm.com,
	anshuman.khandual@arm.com, ink@jurassic.park.msu.ru,
	luto@kernel.org, tglx@linutronix.de,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, rth@twiddle.net,
	axboe@kernel.dk, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	tbogendoerfer@suse.de, paul.burton@mips.com,
	linux-alpha@vger.kernel.org, cai@lca.pw,
	akpm@linux-foundation.org, robin.murphy@arm.com,
	davem@davemloft.net
Subject: Re: [PATCH v2 2/9] x86: numa: check the node id consistently for x86
Date: Mon, 2 Sep 2019 20:25:24 +0800	[thread overview]
Message-ID: <5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com> (raw)
In-Reply-To: <20190902072542.GN2369@hirez.programming.kicks-ass.net>

On 2019/9/2 15:25, Peter Zijlstra wrote:
> On Mon, Sep 02, 2019 at 01:46:51PM +0800, Yunsheng Lin wrote:
>> On 2019/9/1 0:12, Peter Zijlstra wrote:
> 
>>> 1) because even it is not set, the device really does belong to a node.
>>> It is impossible a device will have magic uniform access to memory when
>>> CPUs cannot.
>>
>> So it means dev_to_node() will return either NUMA_NO_NODE or a
>> valid node id?
> 
> NUMA_NO_NODE := -1, which is not a valid node number. It is also, like I
> said, not a valid device location on a NUMA system.
> 
> Just because ACPI/BIOS is shit, doesn't mean the device doesn't have a
> node association. It just means we don't know and might have to guess.

How do we guess the device's location when ACPI/BIOS does not set it?

It seems dev_to_node() does not do anything about that and leave the
job to the caller or whatever function that get called with its return
value, such as cpumask_of_node().

> 
>>> 2) is already true today, cpumask_of_node() requires a valid node_id.
>>
>> Ok, most of the user does check node_id before calling
>> cpumask_of_node(), but does a little different type of checking:
>>
>> 1) some does " < 0" check;
>> 2) some does "== NUMA_NO_NODE" check;
>> 3) some does ">= MAX_NUMNODES" check;
>> 4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.
> 
> The one true way is:
> 
> 	'(unsigned)node_id >= nr_node_ids'

I missed the magic of the "unsigned" in your previous reply.

> 
>>> 3) is just wrong and increases overhead for everyone.
>>
>> Ok, cpumask_of_node() is also used in some critical path such
>> as scheduling, which may not need those checking, the overhead
>> is unnecessary.
>>
>> But for non-critical path such as setup or configuration path,
>> it better to have consistent checking, and also simplify the
>> user code that calls cpumask_of_node().
>>
>> Do you think it is worth the trouble to add a new function
>> such as cpumask_of_node_check(maybe some other name) to do
>> consistent checking?
>>
>> Or caller just simply check if dev_to_node()'s return value is
>> NUMA_NO_NODE before calling cpumask_of_node()?
> 
> It is not a matter of convenience. The function is called
> cpumask_of_node(), when node < 0 || node >= nr_node_ids, it is not a
> valid node, therefore the function shouldn't return anything except an
> error.
what do you mean by error? What I can think is three type of errors:
1) return NULL, this way it seems cpumask_of_node() also leave the
   job to the function that calls it.
2) cpu_none_mask, I am not sure what this means, maybe it means there
   is no cpu on the same node with the device?
3) give a warning, stack dump, or even a BUG_ON?

I would prefer the second one, and implement the third one when the
CONFIG_DEBUG_PER_CPU_MAPS is selected.

Any suggestion?

> 
> Also note that the CONFIG_DEBUG_PER_CPU_MAPS version of
> cpumask_of_node() already does this (although it wants the below fix).

Thanks for the note and example.


WARNING: multiple messages have this Message-ID (diff)
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dalias@libc.org, linux-sh@vger.kernel.org,
	catalin.marinas@arm.com, dave.hansen@linux.intel.com,
	heiko.carstens@de.ibm.com, linuxarm@huawei.com,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	mwb@linux.vnet.ibm.com, paulus@samba.org, hpa@zytor.com,
	sparclinux@vger.kernel.org, chenhc@lemote.com, will@kernel.org,
	bp@alien8.de, linux-s390@vger.kernel.org,
	ysato@users.sourceforge.jp, mpe@ellerman.id.au, x86@kernel.org,
	rppt@linux.ibm.com, borntraeger@de.ibm.com, dledford@redhat.com,
	mingo@redhat.com, jeffrey.t.kirsher@intel.com,
	benh@kernel.crashing.org, jhogan@kernel.org,
	nfont@linux.vnet.ibm.com, mattst88@gmail.com,
	len.brown@intel.com, gor@linux.ibm.com,
	anshuman.khandual@arm.com, ink@jurassic.park.msu.ru,
	luto@kernel.org, tglx@linutronix.de,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, rth@twiddle.net,
	axboe@kernel.dk, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org, ralf@linux-mips.org,
	tbogendoerfer@suse.de, paul.burton@mips.com,
	linux-alpha@vger.kernel.org, cai@lca.pw,
	akpm@linux-foundation.org, robin.murphy@arm.com,
	davem@davemloft.net
Subject: Re: [PATCH v2 2/9] x86: numa: check the node id consistently for x86
Date: Mon, 2 Sep 2019 20:25:24 +0800	[thread overview]
Message-ID: <5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com> (raw)
In-Reply-To: <20190902072542.GN2369@hirez.programming.kicks-ass.net>

On 2019/9/2 15:25, Peter Zijlstra wrote:
> On Mon, Sep 02, 2019 at 01:46:51PM +0800, Yunsheng Lin wrote:
>> On 2019/9/1 0:12, Peter Zijlstra wrote:
> 
>>> 1) because even it is not set, the device really does belong to a node.
>>> It is impossible a device will have magic uniform access to memory when
>>> CPUs cannot.
>>
>> So it means dev_to_node() will return either NUMA_NO_NODE or a
>> valid node id?
> 
> NUMA_NO_NODE := -1, which is not a valid node number. It is also, like I
> said, not a valid device location on a NUMA system.
> 
> Just because ACPI/BIOS is shit, doesn't mean the device doesn't have a
> node association. It just means we don't know and might have to guess.

How do we guess the device's location when ACPI/BIOS does not set it?

It seems dev_to_node() does not do anything about that and leave the
job to the caller or whatever function that get called with its return
value, such as cpumask_of_node().

> 
>>> 2) is already true today, cpumask_of_node() requires a valid node_id.
>>
>> Ok, most of the user does check node_id before calling
>> cpumask_of_node(), but does a little different type of checking:
>>
>> 1) some does " < 0" check;
>> 2) some does "== NUMA_NO_NODE" check;
>> 3) some does ">= MAX_NUMNODES" check;
>> 4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.
> 
> The one true way is:
> 
> 	'(unsigned)node_id >= nr_node_ids'

I missed the magic of the "unsigned" in your previous reply.

> 
>>> 3) is just wrong and increases overhead for everyone.
>>
>> Ok, cpumask_of_node() is also used in some critical path such
>> as scheduling, which may not need those checking, the overhead
>> is unnecessary.
>>
>> But for non-critical path such as setup or configuration path,
>> it better to have consistent checking, and also simplify the
>> user code that calls cpumask_of_node().
>>
>> Do you think it is worth the trouble to add a new function
>> such as cpumask_of_node_check(maybe some other name) to do
>> consistent checking?
>>
>> Or caller just simply check if dev_to_node()'s return value is
>> NUMA_NO_NODE before calling cpumask_of_node()?
> 
> It is not a matter of convenience. The function is called
> cpumask_of_node(), when node < 0 || node >= nr_node_ids, it is not a
> valid node, therefore the function shouldn't return anything except an
> error.
what do you mean by error? What I can think is three type of errors:
1) return NULL, this way it seems cpumask_of_node() also leave the
   job to the function that calls it.
2) cpu_none_mask, I am not sure what this means, maybe it means there
   is no cpu on the same node with the device?
3) give a warning, stack dump, or even a BUG_ON?

I would prefer the second one, and implement the third one when the
CONFIG_DEBUG_PER_CPU_MAPS is selected.

Any suggestion?

> 
> Also note that the CONFIG_DEBUG_PER_CPU_MAPS version of
> cpumask_of_node() already does this (although it wants the below fix).

Thanks for the note and example.


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: dalias@libc.org, linux-sh@vger.kernel.org,
	catalin.marinas@arm.com, dave.hansen@linux.intel.com,
	heiko.carstens@de.ibm.com, linuxarm@huawei.com,
	jiaxun.yang@flygoat.com, linux-mips@vger.kernel.org,
	mwb@linux.vnet.ibm.com, paulus@samba.org, hpa@zytor.com,
	sparclinux@vger.kernel.org, chenhc@lemote.com, will@kernel.org,
	bp@alien8.de, linux-s390@vger.kernel.org,
	ysato@users.sourceforge.jp, mpe@ellerman.id.au, x86@kernel.org,
	rppt@linux.ibm.com, borntraeger@de.ibm.com, dledford@redhat.com,
	mingo@redhat.com, jeffrey.t.kirsher@intel.com,
	benh@kernel.crashing.org, jhogan@kernel.org,
	nfont@linux.vnet.ibm.com, mattst88@gmail.com,
	len.brown@intel.com, gor@linux.ibm.com,
	anshuman.khandual@arm.com, ink@jurassic.park.msu.ru,
	luto@kernel.org, tglx@linutronix.de,
	naveen.n.rao@linux.vnet.ibm.com,
	linux-arm-kernel@lists.infradead.org, rth@twiddle.net,
	axboe@kernel.dk, linuxppc-dev@lists.oz
Subject: Re: [PATCH v2 2/9] x86: numa: check the node id consistently for x86
Date: Mon, 2 Sep 2019 20:25:24 +0800	[thread overview]
Message-ID: <5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com> (raw)
In-Reply-To: <20190902072542.GN2369@hirez.programming.kicks-ass.net>

On 2019/9/2 15:25, Peter Zijlstra wrote:
> On Mon, Sep 02, 2019 at 01:46:51PM +0800, Yunsheng Lin wrote:
>> On 2019/9/1 0:12, Peter Zijlstra wrote:
> 
>>> 1) because even it is not set, the device really does belong to a node.
>>> It is impossible a device will have magic uniform access to memory when
>>> CPUs cannot.
>>
>> So it means dev_to_node() will return either NUMA_NO_NODE or a
>> valid node id?
> 
> NUMA_NO_NODE := -1, which is not a valid node number. It is also, like I
> said, not a valid device location on a NUMA system.
> 
> Just because ACPI/BIOS is shit, doesn't mean the device doesn't have a
> node association. It just means we don't know and might have to guess.

How do we guess the device's location when ACPI/BIOS does not set it?

It seems dev_to_node() does not do anything about that and leave the
job to the caller or whatever function that get called with its return
value, such as cpumask_of_node().

> 
>>> 2) is already true today, cpumask_of_node() requires a valid node_id.
>>
>> Ok, most of the user does check node_id before calling
>> cpumask_of_node(), but does a little different type of checking:
>>
>> 1) some does " < 0" check;
>> 2) some does "== NUMA_NO_NODE" check;
>> 3) some does ">= MAX_NUMNODES" check;
>> 4) some does "< 0 || >= MAX_NUMNODES || !node_online(node)" check.
> 
> The one true way is:
> 
> 	'(unsigned)node_id >= nr_node_ids'

I missed the magic of the "unsigned" in your previous reply.

> 
>>> 3) is just wrong and increases overhead for everyone.
>>
>> Ok, cpumask_of_node() is also used in some critical path such
>> as scheduling, which may not need those checking, the overhead
>> is unnecessary.
>>
>> But for non-critical path such as setup or configuration path,
>> it better to have consistent checking, and also simplify the
>> user code that calls cpumask_of_node().
>>
>> Do you think it is worth the trouble to add a new function
>> such as cpumask_of_node_check(maybe some other name) to do
>> consistent checking?
>>
>> Or caller just simply check if dev_to_node()'s return value is
>> NUMA_NO_NODE before calling cpumask_of_node()?
> 
> It is not a matter of convenience. The function is called
> cpumask_of_node(), when node < 0 || node >= nr_node_ids, it is not a
> valid node, therefore the function shouldn't return anything except an
> error.
what do you mean by error? What I can think is three type of errors:
1) return NULL, this way it seems cpumask_of_node() also leave the
   job to the function that calls it.
2) cpu_none_mask, I am not sure what this means, maybe it means there
   is no cpu on the same node with the device?
3) give a warning, stack dump, or even a BUG_ON?

I would prefer the second one, and implement the third one when the
CONFIG_DEBUG_PER_CPU_MAPS is selected.

Any suggestion?

> 
> Also note that the CONFIG_DEBUG_PER_CPU_MAPS version of
> cpumask_of_node() already does this (although it wants the below fix).

Thanks for the note and example.

  reply	other threads:[~2019-09-02 12:25 UTC|newest]

Thread overview: 164+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-31  5:58 [PATCH v2 0/9] check the node id consistently across different arches Yunsheng Lin
2019-08-31  5:58 ` Yunsheng Lin
2019-08-31  5:58 ` Yunsheng Lin
2019-08-31  5:58 ` Yunsheng Lin
2019-08-31  5:58 ` Yunsheng Lin
2019-08-31  5:58 ` Yunsheng Lin
2019-08-31  5:58 ` [PATCH v2 1/9] arm64: numa: check the node id consistently for arm64 Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58 ` [PATCH v2 2/9] x86: numa: check the node id consistently for x86 Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  8:55   ` Peter Zijlstra
2019-08-31  8:55     ` Peter Zijlstra
2019-08-31  8:55     ` Peter Zijlstra
2019-08-31  8:55     ` Peter Zijlstra
2019-08-31  8:55     ` Peter Zijlstra
2019-08-31 10:09     ` Yunsheng Lin
2019-08-31 10:09       ` Yunsheng Lin
2019-08-31 10:09       ` Yunsheng Lin
2019-08-31 10:09       ` Yunsheng Lin
2019-08-31 10:09       ` Yunsheng Lin
2019-08-31 10:09       ` Yunsheng Lin
2019-08-31 16:12       ` Peter Zijlstra
2019-08-31 16:12         ` Peter Zijlstra
2019-08-31 16:12         ` Peter Zijlstra
2019-08-31 16:12         ` Peter Zijlstra
2019-08-31 16:12         ` Peter Zijlstra
2019-09-01  4:45         ` Something about loongson_llsc_mb 陈华才
     [not found]           ` <2019090410032559707512@loongson.cn>
2019-09-04  9:21             ` Peter Zijlstra
2019-09-04 10:04               ` Peter Zijlstra
2019-09-04 12:57               ` Huang Pei
2019-09-02  5:46         ` [PATCH v2 2/9] x86: numa: check the node id consistently for x86 Yunsheng Lin
2019-09-02  5:46           ` Yunsheng Lin
2019-09-02  5:46           ` Yunsheng Lin
2019-09-02  5:46           ` Yunsheng Lin
2019-09-02  5:46           ` Yunsheng Lin
2019-09-02  5:46           ` Yunsheng Lin
2019-09-02  7:25           ` Peter Zijlstra
2019-09-02  7:25             ` Peter Zijlstra
2019-09-02  7:25             ` Peter Zijlstra
2019-09-02  7:25             ` Peter Zijlstra
2019-09-02  7:25             ` Peter Zijlstra
2019-09-02 12:25             ` Yunsheng Lin [this message]
2019-09-02 12:25               ` Yunsheng Lin
2019-09-02 12:25               ` Yunsheng Lin
2019-09-02 12:25               ` Yunsheng Lin
2019-09-02 12:25               ` Yunsheng Lin
2019-09-02 12:25               ` Yunsheng Lin
2019-09-02 12:56               ` Peter Zijlstra
2019-09-02 12:56                 ` Peter Zijlstra
2019-09-02 12:56                 ` Peter Zijlstra
2019-09-02 12:56                 ` Peter Zijlstra
2019-09-02 18:22                 ` Ingo Molnar
2019-09-02 18:22                   ` Ingo Molnar
2019-09-02 18:22                   ` Ingo Molnar
2019-09-02 18:22                   ` Ingo Molnar
2019-09-02 19:14                   ` Peter Zijlstra
2019-09-02 19:14                     ` Peter Zijlstra
2019-09-02 19:14                     ` Peter Zijlstra
2019-09-02 19:14                     ` Peter Zijlstra
2019-09-03  6:19                 ` Yunsheng Lin
2019-09-03  6:19                   ` Yunsheng Lin
2019-09-03  6:19                   ` Yunsheng Lin
2019-09-03  6:19                   ` Yunsheng Lin
2019-09-03  6:19                   ` Yunsheng Lin
2019-09-03  7:11                   ` Peter Zijlstra
2019-09-03  7:11                     ` Peter Zijlstra
2019-09-03  7:11                     ` Peter Zijlstra
2019-09-03  7:11                     ` Peter Zijlstra
2019-09-03  8:31                     ` Yunsheng Lin
2019-09-03  8:31                       ` Yunsheng Lin
2019-09-03  8:31                       ` Yunsheng Lin
2019-09-03  8:31                       ` Yunsheng Lin
2019-09-03  8:31                       ` Yunsheng Lin
2019-09-03 12:15                     ` Salil Mehta
2019-09-03 12:15                       ` Salil Mehta
2019-09-03 14:28                       ` Peter Zijlstra
2019-09-03 14:28                         ` Peter Zijlstra
2019-09-02 18:17             ` Ingo Molnar
2019-09-02 18:17               ` Ingo Molnar
2019-09-02 18:17               ` Ingo Molnar
2019-09-02 18:17               ` Ingo Molnar
2019-09-03  7:53               ` [PATCH] x86/mm: Fix cpumask_of_node() error condition Peter Zijlstra
2019-09-03  7:53                 ` Peter Zijlstra
2019-09-03  7:53                 ` Peter Zijlstra
2019-09-03  7:53                 ` Peter Zijlstra
2019-08-31  5:58 ` [PATCH v2 3/9] alpha: numa: check the node id consistently for alpha Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58 ` [PATCH v2 4/9] powerpc: numa: check the node id consistently for powerpc Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58 ` [PATCH v2 5/9] s390: numa: check the node id consistently for s390 Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-09-02  4:05   ` kbuild test robot
2019-09-02  4:05     ` kbuild test robot
2019-09-02  4:05     ` kbuild test robot
2019-09-02  4:05     ` kbuild test robot
2019-08-31  5:58 ` [PATCH v2 6/9] sh: numa: check the node id consistently for sh Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58 ` [PATCH v2 7/9] sparc64: numa: check the node id consistently for sparc64 Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  6:53   ` David Miller
2019-08-31  6:53     ` David Miller
2019-08-31  6:53     ` David Miller
2019-08-31  6:53     ` David Miller
2019-08-31  6:53     ` David Miller
2019-08-31  8:57     ` Yunsheng Lin
2019-08-31  8:57       ` Yunsheng Lin
2019-08-31  8:57       ` Yunsheng Lin
2019-08-31  8:57       ` Yunsheng Lin
2019-08-31  8:57       ` Yunsheng Lin
2019-08-31  8:57       ` Yunsheng Lin
2019-08-31 20:02       ` David Miller
2019-08-31 20:02         ` David Miller
2019-08-31 20:02         ` David Miller
2019-08-31 20:02         ` David Miller
2019-08-31 20:02         ` David Miller
2019-09-02  6:08         ` Yunsheng Lin
2019-09-02  6:08           ` Yunsheng Lin
2019-09-02  6:08           ` Yunsheng Lin
2019-09-02  6:08           ` Yunsheng Lin
2019-09-02  6:08           ` Yunsheng Lin
2019-09-02  6:08           ` Yunsheng Lin
2019-09-02 15:17           ` David Miller
2019-09-02 15:17             ` David Miller
2019-09-02 15:17             ` David Miller
2019-09-02 15:17             ` David Miller
2019-09-02 15:17             ` David Miller
2019-08-31  5:58 ` [PATCH v2 8/9] mips: numa: check the node id consistently for mips ip27 Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31 15:45   ` Paul Burton
2019-08-31 15:45     ` Paul Burton
2019-08-31 15:45     ` Paul Burton
2019-09-02  6:11     ` Yunsheng Lin
2019-09-02  6:11       ` Yunsheng Lin
2019-09-02  6:11       ` Yunsheng Lin
2019-08-31  5:58 ` [PATCH v2 9/9] mips: numa: check the node id consistently for mips loongson64 Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin
2019-08-31  5:58   ` Yunsheng Lin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5fa2aa99-89fa-cd41-b090-36a23cfdeb73@huawei.com \
    --to=linyunsheng@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=axboe@kernel.dk \
    --cc=benh@kernel.crashing.org \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@alien8.de \
    --cc=cai@lca.pw \
    --cc=catalin.marinas@arm.com \
    --cc=chenhc@lemote.com \
    --cc=dalias@libc.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=dledford@redhat.com \
    --cc=gor@linux.ibm.com \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hpa@zytor.com \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jeffrey.t.kirsher@intel.com \
    --cc=jhogan@kernel.org \
    --cc=jiaxun.yang@flygoat.com \
    --cc=len.brown@intel.com \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=mattst88@gmail.com \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=mwb@linux.vnet.ibm.com \
    --cc=naveen.n.rao@linux.vnet.ibm.com \
    --cc=nfont@linux.vnet.ibm.com \
    --cc=paul.burton@mips.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=ralf@linux-mips.org \
    --cc=robin.murphy@arm.com \
    --cc=rppt@linux.ibm.com \
    --cc=rth@twiddle.net \
    --cc=sparclinux@vger.kernel.org \
    --cc=tbogendoerfer@suse.de \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.