From: Yunsheng Lin <linyunsheng@huawei.com>
To: Greg KH <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Michal Hocko <mhocko@kernel.org>,
Robin Murphy <robin.murphy@arm.com>, <catalin.marinas@arm.com>,
<will@kernel.org>, <mingo@redhat.com>, <bp@alien8.de>,
<rth@twiddle.net>, <ink@jurassic.park.msu.ru>,
<mattst88@gmail.com>, <benh@kernel.crashing.org>,
<paulus@samba.org>, <mpe@ellerman.id.au>,
<heiko.carstens@de.ibm.com>, <gor@linux.ibm.com>,
<borntraeger@de.ibm.com>, <ysato@users.sourceforge.jp>,
<dalias@libc.org>, <davem@davemloft.net>, <ralf@linux-mips.org>,
<paul.burton@mips.com>, <jhogan@kernel.org>,
<jiaxun.yang@flygoat.com>, <chenhc@lemote.com>,
<akpm@linux-foundation.org>, <rppt@linux.ibm.com>,
<anshuman.khandual@arm.com>, <tglx@linutronix.de>, <cai@lca.pw>,
<linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>, <hpa@zytor.com>, <x86@kernel.org>,
<dave.hansen@linux.intel.com>, <luto@kernel.org>,
<len.brown@intel.com>, <axboe@kernel.dk>, <dledford@redhat.com>,
<jeffrey.t.kirsher@intel.com>, <linux-alpha@vger.kernel.org>,
<naveen.n.rao@linux.vnet.ibm.com>, <mwb@linux.vnet.ibm.com>,
<linuxppc-dev@lists.ozlabs.org>, <linux-s390@vger.kernel.org>,
<linux-sh@vger.kernel.org>, <sparclinux@vger.kernel.org>,
<tbogendoerfer@suse.de>, <linux-mips@vger.kernel.org>,
<rafael@kernel.org>, <bhelgaas@google.com>,
<linux-pci@vger.kernel.org>,
"Rafael J. Wysocki" <rjw@rjwysocki.net>, <lenb@kernel.org>,
<linux-acpi@vger.kernel.org>
Subject: Re: [PATCH v6] numa: make node_to_cpumask_map() NUMA_NO_NODE aware
Date: Mon, 14 Oct 2019 16:00:46 +0800
Message-ID: <82000bc8-6912-205b-0251-25b9cc430973@huawei.com>
In-Reply-To: <20191012104742.GA2053473@kroah.com>
On 2019/10/12 18:47, Greg KH wrote:
> On Sat, Oct 12, 2019 at 12:40:01PM +0200, Greg KH wrote:
>> On Sat, Oct 12, 2019 at 05:47:56PM +0800, Yunsheng Lin wrote:
>>> On 2019/10/12 15:40, Greg KH wrote:
>>>> On Sat, Oct 12, 2019 at 02:17:26PM +0800, Yunsheng Lin wrote:
>>>>> add pci and acpi maintainer
>>>>> cc linux-pci@vger.kernel.org and linux-acpi@vger.kernel.org
>>>>>
>>>>> On 2019/10/11 19:15, Peter Zijlstra wrote:
>>>>>> On Fri, Oct 11, 2019 at 11:27:54AM +0800, Yunsheng Lin wrote:
>>>>>>> But I failed to see why the above is related to making node_to_cpumask_map()
>>>>>>> NUMA_NO_NODE aware?
>>>>>>
>>>>>> Your initial bug is for hns3, which is a PCI device, which really _MUST_
>>>>>> have a node assigned.
>>>>>>
>>>>>> It not having one, is a straight up bug. We must not silently accept
>>>>>> NO_NODE there, ever.
>>>>>>
>>>>>
>>>>> I suppose you mean reporting a lack of affinity when the node of a pcie
>>>>> device is not set by "not silently accept NO_NODE".
>>>>
>>>> If the firmware of a pci device does not provide the node information,
>>>> then yes, warn about that.
>>>>
>>>>> As Greg has asked about in [1]:
>>>>> what is a user to do when the user sees the kernel reporting that?
>>>>>
>>>>> We may tell user to contact their vendor for info or updates about
>>>>> that when they do not know about their system well enough, but their
>>>>> vendor may get away with this by quoting ACPI spec as the spec
>>>>> considering this optional. Should the user believe this is indeed a
>>>>> fw bug or a misreport from the kernel?
>>>>
>>>> Say it is a firmware bug, if it is a firmware bug, that's simple.
>>>>
>>>>> If this kind of reporting is common practice and will not cause any
>>>>> misunderstanding, then maybe we can report that.
>>>>
>>>> Yes, please do so, that's the only way those boxes are ever going to get
>>>> fixed. And go add the test to the "firmware testing" tool that is based
>>>> on Linux that Intel has somewhere, to give vendors a chance to fix this
>>>> before they ship hardware.
>>>>
>>>> This shouldn't be a big deal, we warn of other hardware bugs all the
>>>> time.
>>>
>>> Ok, thanks for clarifying.
>>>
>>> Will send a patch to catch the case where a pcie device has no numa node
>>> set, and warn about it.
>>>
>>> Maybe use dev->bus to verify if it is a pci device?
>>
>> No, do that in the pci bus core code itself, when creating the devices
>> as that is when you know, or do not know, the numa node, right?
>>
>> This can't be in the driver core only, as each bus type will have a
>> different way of determining what the node the device is on. For some
>> reason, I thought the PCI core code already does this, right?
>
> Yes, pci_irq_get_node(), which NO ONE CALLS! I should go delete that
> thing...
>
> Anyway, it looks like the pci core code does call set_dev_node() based
> on the PCI bridge, so if that is set up properly, all should be fine.
>
> If not, well, you have buggy firmware and you need to warn about that at
> the time you are creating the bridge. Look at the call to
> pcibus_to_node() in pci_register_host_bridge().
Thanks for pointing out the specific function.
Maybe we do not need to warn when the device has a parent: if the parent
also has a node of NO_NODE, we must already have warned about the parent,
so there is no need to warn about the child device as well? Like below:
@@ -932,6 +932,10 @@ static int pci_register_host_bridge(struct pci_host_bridge *bridge)
 	list_add_tail(&bus->node, &pci_root_buses);
 	up_write(&pci_bus_sem);
+	if (nr_node_ids > 1 && !parent &&
+	    dev_to_node(bus->bridge) == NUMA_NO_NODE)
+		dev_err(bus->bridge, FW_BUG "No node assigned on NUMA capable HW. Please contact your vendor for updates.\n");
+
 	return 0;
Also, we do not need to warn about that in pci_device_add(), right?
We must already have warned about the PCI host bridge of the PCI device.
I may be wrong about the above, because I am not very familiar with PCI.
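To make the intended condition easier to review, here is a minimal userspace sketch of the check, not kernel code. It assumes the warning should fire only for a bridge with no parent, on a system with more than one node, whose node is NUMA_NO_NODE. The names struct model_bridge and bridge_needs_node_warning() are hypothetical and exist only for this illustration; NUMA_NO_NODE is defined as -1, as in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1) /* same value the kernel uses */

/* Hypothetical, simplified stand-in for a host bridge (not the kernel struct). */
struct model_bridge {
	const struct model_bridge *parent; /* NULL when the bridge has no parent */
	int node;                          /* NUMA node id, or NUMA_NO_NODE */
};

/*
 * Returns true when the proposed check would print the FW_BUG message:
 * more than one node exists, the bridge has no parent, and no node was
 * assigned. A bridge with a parent is skipped, on the assumption that
 * the parent has already been checked.
 */
static bool bridge_needs_node_warning(const struct model_bridge *bridge,
				      unsigned int nr_node_ids)
{
	assert(bridge != NULL);

	return nr_node_ids > 1 &&
	       bridge->parent == NULL &&
	       bridge->node == NUMA_NO_NODE;
}
```

With this model, a parentless bridge without a node warns only when nr_node_ids is greater than 1, and a bridge that has a parent never warns by itself.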
>
> And yes, you need to do this all on a per-bus-type basis, as has been
> pointed out. It's up to the bus to create the device and set this up
> properly.
Thanks.
Will do that on a per-bus-type basis.
>
> thanks,
>
> greg k-h
>
> .
>