From: Yunsheng Lin <linyunsheng@huawei.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Greg KH <gregkh@linuxfoundation.org>, <rafael@kernel.org>,
	<linux-kernel@vger.kernel.org>, <peterz@infradead.org>,
	<mingo@kernel.org>, <linuxarm@huawei.com>
Subject: Re: [PATCH] driver core: ensure a device has valid node id in device_add()
Date: Wed, 11 Sep 2019 15:22:30 +0800
Message-ID: <3b977388-5f25-d0b5-bdc9-f963a9be2bd1@huawei.com>
In-Reply-To: <20190911064926.GJ4023@dhcp22.suse.cz>

On 2019/9/11 14:49, Michal Hocko wrote:
> On Wed 11-09-19 14:15:51, Yunsheng Lin wrote:
>> On 2019/9/11 13:33, Michal Hocko wrote:
>>> On Tue 10-09-19 14:53:39, Michal Hocko wrote:
>>>> On Tue 10-09-19 20:47:40, Yunsheng Lin wrote:
>>>>> On 2019/9/10 19:12, Greg KH wrote:
>>>>>> On Tue, Sep 10, 2019 at 01:04:51PM +0200, Michal Hocko wrote:
>>>>>>> On Tue 10-09-19 18:58:05, Yunsheng Lin wrote:
>>>>>>>> On 2019/9/10 17:31, Greg KH wrote:
>>>>>>>>> On Tue, Sep 10, 2019 at 02:43:32PM +0800, Yunsheng Lin wrote:
>>>>>>>>>> On 2019/9/9 17:53, Greg KH wrote:
>>>>>>>>>>> On Mon, Sep 09, 2019 at 02:04:23PM +0800, Yunsheng Lin wrote:
>>>>>>>>>>>> Currently a device does not belong to any of the numa nodes
>>>>>>>>>>>> (dev->numa_node is NUMA_NO_NODE) when the node id is neither
>>>>>>>>>>>> specified by fw nor by virtual device layer and the device has
>>>>>>>>>>>> no parent device.
>>>>>>>>>>>
>>>>>>>>>>> Is this really a problem?
>>>>>>>>>>
>>>>>>>>>> Not really.
>>>>>>>>>> Someone need to guess the node id when it is not specified, right?
>>>>>>>>>
>>>>>>>>> No, why?  Guessing guarantees you will get it wrong on some systems.
>>>>>>>>>
>>>>>>>>> Are you seeing real problems because the id is not being set?  What
>>>>>>>>> problem is this fixing that you can actually observe?
>>>>>>>>
>>>>>>>> When passing the return value of dev_to_node() to cpumask_of_node()
>>>>>>>> without checking the node id if the node id is not valid, there is
>>>>>>>> global-out-of-bounds detected by KASAN as below:
>>>>>>>
>>>>>>> OK, I seem to remember this being brought up already. And now when I
>>>>>>> think about it, we really want to make cpumask_of_node NUMA_NO_NODE
>>>>>>> aware. That means using the same trick the allocator does for this
>>>>>>> special case.
>>>>>>
>>>>>> That seems reasonable to me, and much more "obvious" as to what is going
>>>>>> on.
>>>>>>
>>>>>
>>>>> Ok, thanks for the suggestion.
>>>>>
>>>>> For arm64 and x86, there are two versions of cpumask_of_node().
>>>>>
>>>>> when CONFIG_DEBUG_PER_CPU_MAPS is defined, the cpumask_of_node()
>>>>>    in arch/x86/mm/numa.c is used, which does partial node id checking:
>>>>>
>>>>> const struct cpumask *cpumask_of_node(int node)
>>>>> {
>>>>>         if (node >= nr_node_ids) {
>>>>>                 printk(KERN_WARNING
>>>>>                         "cpumask_of_node(%d): node > nr_node_ids(%u)\n",
>>>>>                         node, nr_node_ids);
>>>>>                 dump_stack();
>>>>>                 return cpu_none_mask;
>>>>>         }
>>>>>         if (node_to_cpumask_map[node] == NULL) {
>>>>>                 printk(KERN_WARNING
>>>>>                         "cpumask_of_node(%d): no node_to_cpumask_map!\n",
>>>>>                         node);
>>>>>                 dump_stack();
>>>>>                 return cpu_online_mask;
>>>>>         }
>>>>>         return node_to_cpumask_map[node];
>>>>> }
>>>>>
>>>>> when CONFIG_DEBUG_PER_CPU_MAPS is undefined, the cpumask_of_node()
>>>>>    in arch/x86/include/asm/topology.h is used:
>>>>>
>>>>> static inline const struct cpumask *cpumask_of_node(int node)
>>>>> {
>>>>>         return node_to_cpumask_map[node];
>>>>> }
>>>>
>>>> I would simply go with. There shouldn't be any need for heavy weight
>>>> checks that CONFIG_DEBUG_PER_CPU_MAPS has.
>>>>
>>>> static inline const struct cpumask *cpumask_of_node(int node)
>>>> {
>>>> 	/* A nice comment goes here */
>>>> 	if (node == NUMA_NO_NODE)
>>
>> How about "(unsigned int)node >= nr_node_ids"? This was suggested
>> by Peter; it catches the case where the node id set by fw is greater
>> than or equal to nr_node_ids, and still handles the < 0 case, which
>> includes NUMA_NO_NODE.
> 
> Isn't that a plain bug? Is something like that really happening?

I have not seen that happen, apart from the NUMA_NO_NODE case.
Even the NUMA_NO_NODE case went unnoticed until we turned on
KASAN detection.

It seems that nothing prevents the node of a device from being set
to an invalid node.
And the kernel currently has a few different checks:
1) some code does a "< 0" check;
2) some does a "== NUMA_NO_NODE" check;
3) some does a ">= MAX_NUMNODES" check;
4) some does a "< 0 || >= MAX_NUMNODES || !node_online(node)" check.

We need to be consistent about the checking, right?
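
To make the difference between these checks concrete, here is a minimal
userspace sketch (not kernel code). The helper names, the value of
nr_node_ids, MAX_NUMNODES and the online mask are all assumptions made up
for illustration; only the comparison expressions come from the discussion
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; the kernel defines its own. */
#define NUMA_NO_NODE (-1)
#define MAX_NUMNODES 64

/* Hypothetical system: 4 possible nodes, of which nodes 0 and 1 are online. */
static unsigned int nr_node_ids = 4;
static unsigned long long online_mask = 0x3ULL;

/* Check 1): rejects every negative id, nothing else. */
static bool bad_if_negative(int node)
{
	return node < 0;
}

/* Check 2): rejects exactly NUMA_NO_NODE, nothing else. */
static bool bad_if_no_node(int node)
{
	return node == NUMA_NO_NODE;
}

/* Check 3): rejects ids at or above the compile-time maximum only. */
static bool bad_if_over_max(int node)
{
	return node >= MAX_NUMNODES;
}

/* Check 4): negative, over the maximum, or not online. */
static bool bad_if_offline(int node)
{
	return node < 0 || node >= MAX_NUMNODES ||
	       !((online_mask >> node) & 1ULL);
}

/*
 * The unsigned comparison: a negative int converts to a very large
 * unsigned value, so one test rejects both negative ids (including
 * NUMA_NO_NODE) and ids at or above nr_node_ids.
 */
static bool bad_if_out_of_range(int node)
{
	return (unsigned int)node >= nr_node_ids;
}
```

With nr_node_ids set to 4 in this sketch, a node id of 4 passes checks 1)
to 3) but is rejected by the unsigned comparison, and a node id of -5
passes check 2) but is rejected by it as well.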
