* [PATCH v4] eal: Set numa node value for system which not support it.
@ 2017-05-11  1:56 Tonghao Zhang
  2017-06-22 15:15 ` Sergio Gonzalez Monroy
  0 siblings, 1 reply; 7+ messages in thread
From: Tonghao Zhang @ 2017-05-11  1:56 UTC
  To: dev; +Cc: Tonghao Zhang

The NUMA node information for PCI devices provided through
sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
It is good to see more checking for valid values.

Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
---
 lib/librte_eal/linuxapp/eal/eal_pci.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 595622b..c817b4c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -310,18 +310,18 @@
 			dev->max_vfs = (uint16_t)tmp;
 	}
 
-	/* get numa node */
+	/* get numa node, default to 0 if not present */
 	snprintf(filename, sizeof(filename), "%s/numa_node",
 		 dirname);
-	if (access(filename, R_OK) != 0) {
-		/* if no NUMA support, set default to 0 */
-		dev->device.numa_node = 0;
-	} else {
-		if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-			free(dev);
-			return -1;
-		}
+
+	if (eal_parse_sysfs_value(filename, &tmp) == 0 &&
+		tmp < RTE_MAX_NUMA_NODES)
 		dev->device.numa_node = tmp;
+	else {
+		RTE_LOG(WARNING, EAL,
+			"numa_node is invalid or not present. "
+			"Set it 0 as default\n");
+		dev->device.numa_node = 0;
 	}
 
 	rte_pci_device_name(addr, dev->name, sizeof(dev->name));
-- 
1.8.3.1

* Re: [PATCH v4] eal: Set numa node value for system which not support it.
  2017-05-11  1:56 [PATCH v4] eal: Set numa node value for system which not support it Tonghao Zhang
@ 2017-06-22 15:15 ` Sergio Gonzalez Monroy
  2017-06-23 13:02   ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: Sergio Gonzalez Monroy @ 2017-06-22 15:15 UTC
  To: Tonghao Zhang, dev; +Cc: Thomas Monjalon

Just FYI, the summary line should be lowercase apart from acronyms (DPDK
guidelines).

On 11/05/2017 02:56, Tonghao Zhang wrote:
> The NUMA node information for PCI devices provided through
> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> It is good to see more checking for valid values.
>
> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> ---

IMHO the message could be slightly improved by adding some of the
replies that you made to your v3.
For example, a typical wrong numa_node value in VMs:

$ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
-1
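
For illustration, a minimal standalone sketch of how a value like the one
above ends up as node 0. It is not the DPDK code: numa_node_from_text() and
MAX_NUMA_NODES are hypothetical names, and it assumes the value is read as an
unsigned long, as the `tmp < RTE_MAX_NUMA_NODES` comparison in the patch
suggests.

```c
#include <errno.h>
#include <stdlib.h>

/* Assumed limit for illustration; DPDK uses RTE_MAX_NUMA_NODES from its build config. */
#define MAX_NUMA_NODES 8

/*
 * Hypothetical helper: turn the text of a numa_node file into a usable
 * node id, falling back to 0 when the text is missing, malformed, or
 * out of range.
 */
static int
numa_node_from_text(const char *text)
{
	char *end = NULL;
	unsigned long val;

	if (text == NULL)
		return 0;

	errno = 0;
	val = strtoul(text, &end, 0);
	if (errno != 0 || end == text)
		return 0;	/* overflow, or no digits at all */
	if (*end != '\0' && *end != '\n')
		return 0;	/* trailing garbage */

	/* strtoul() negates "-1" in the unsigned type, giving ULONG_MAX. */
	if (val >= MAX_NUMA_NODES)
		return 0;

	return (int)val;
}
```

With this sketch, "-1\n" and "8\n" both map to node 0, while "1\n" maps to
node 1.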

>   lib/librte_eal/linuxapp/eal/eal_pci.c | 18 +++++++++---------
>   1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
> index 595622b..c817b4c 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> @@ -310,18 +310,18 @@
>   			dev->max_vfs = (uint16_t)tmp;
>   	}
>   
> -	/* get numa node */
> +	/* get numa node, default to 0 if not present */
>   	snprintf(filename, sizeof(filename), "%s/numa_node",
>   		 dirname);
> -	if (access(filename, R_OK) != 0) {
> -		/* if no NUMA support, set default to 0 */
> -		dev->device.numa_node = 0;
> -	} else {
> -		if (eal_parse_sysfs_value(filename, &tmp) < 0) {
> -			free(dev);
> -			return -1;
> -		}
> +
> +	if (eal_parse_sysfs_value(filename, &tmp) == 0 &&
> +		tmp < RTE_MAX_NUMA_NODES)
>   		dev->device.numa_node = tmp;
> +	else {
> +		RTE_LOG(WARNING, EAL,
> +			"numa_node is invalid or not present. "
> +			"Set it 0 as default\n");
> +		dev->device.numa_node = 0;
>   	}
>   
>   	rte_pci_device_name(addr, dev->name, sizeof(dev->name));

The code changes look fine, so I leave it to Thomas regarding the commit 
message :)

Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>

* Re: [PATCH v4] eal: Set numa node value for system which not support it.
  2017-06-22 15:15 ` Sergio Gonzalez Monroy
@ 2017-06-23 13:02   ` Thomas Monjalon
  2017-06-26  9:14     ` Sergio Gonzalez Monroy
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2017-06-23 13:02 UTC
  To: Tonghao Zhang; +Cc: dev, Sergio Gonzalez Monroy

22/06/2017 17:15, Sergio Gonzalez Monroy:
> Just fyi, the summary line should be lowercase apart from acronyms (DPDK 
> guidelines).
> 
> On 11/05/2017 02:56, Tonghao Zhang wrote:
> > The NUMA node information for PCI devices provided through
> > sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> > on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> > It is good to see more checking for valid values.
> >
> > Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> > ---
> 
> IMHO the message could be slightly improved by adding some of the 
> replies that you made to your v3.
> ie. Typical wrong numa node in VMs
> 
> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
> -1
[...]
> The code changes look fine, so I leave it to Thomas regarding the commit 
> message :)
> 
> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>

Applied, thanks

* Re: [PATCH v4] eal: Set numa node value for system which not support it.
  2017-06-23 13:02   ` Thomas Monjalon
@ 2017-06-26  9:14     ` Sergio Gonzalez Monroy
  2017-06-26  9:39       ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: Sergio Gonzalez Monroy @ 2017-06-26  9:14 UTC
  To: Thomas Monjalon, Tonghao Zhang; +Cc: dev

On 23/06/2017 14:02, Thomas Monjalon wrote:
> 22/06/2017 17:15, Sergio Gonzalez Monroy:
>> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
>> guidelines).
>>
>> On 11/05/2017 02:56, Tonghao Zhang wrote:
>>> The NUMA node information for PCI devices provided through
>>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
>>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
>>> It is good to see more checking for valid values.
>>>
>>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
>>> ---
>> IMHO the message could be slightly improved by adding some of the
>> replies that you made to your v3.
>> ie. Typical wrong numa node in VMs
>>
>> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
>> -1
> [...]
>> The code changes look fine, so I leave it to Thomas regarding the commit
>> message :)
>>
>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> Applied, thanks

It looks like some systems have quite a few devices that report -1 as
the numa_node value, causing lots of warning messages to be printed.
Quick fixes that come to mind would be:
1) Change log level to DEBUG
2) Add static var to only print the message once.

I also think that the message itself should show at least the BDF, so
that we know which devices are reporting bad numa_node values.

Thoughts?

Sergio
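
A rough sketch of option 2 combined with the BDF suggestion, for illustration
only. The names numa_node_or_default(), numa_warned and MAX_NUMA_NODES are
hypothetical, and plain fprintf() stands in for RTE_LOG so the snippet builds
without DPDK.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumed limit for illustration only. */
#define MAX_NUMA_NODES 8

/* Set after the first warning so that later devices stay quiet. */
static bool numa_warned;

/*
 * Hypothetical helper: return a usable node id for the device named by
 * bdf, warning at most once per process when the value is out of range.
 */
static int
numa_node_or_default(const char *bdf, int node)
{
	if (node >= 0 && node < MAX_NUMA_NODES)
		return node;

	if (!numa_warned) {
		fprintf(stderr,
			"WARNING: %s: numa_node is invalid or not present, "
			"using 0 (further warnings suppressed)\n", bdf);
		numa_warned = true;
	}
	return 0;
}
```

Because the flag is shared by all devices, only the first offending BDF is
ever printed.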

* Re: [PATCH v4] eal: Set numa node value for system which not support it.
  2017-06-26  9:14     ` Sergio Gonzalez Monroy
@ 2017-06-26  9:39       ` Thomas Monjalon
  2017-06-26 12:50         ` Sergio Gonzalez Monroy
  0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2017-06-26  9:39 UTC
  To: Sergio Gonzalez Monroy; +Cc: Tonghao Zhang, dev

26/06/2017 11:14, Sergio Gonzalez Monroy:
> On 23/06/2017 14:02, Thomas Monjalon wrote:
> > 22/06/2017 17:15, Sergio Gonzalez Monroy:
> >> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
> >> guidelines).
> >>
> >> On 11/05/2017 02:56, Tonghao Zhang wrote:
> >>> The NUMA node information for PCI devices provided through
> >>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> >>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> >>> It is good to see more checking for valid values.
> >>>
> >>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> >>> ---
> >> IMHO the message could be slightly improved by adding some of the
> >> replies that you made to your v3.
> >> ie. Typical wrong numa node in VMs
> >>
> >> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
> >> -1
> > [...]
> >> The code changes look fine, so I leave it to Thomas regarding the commit
> >> message :)
> >>
> >> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> > Applied, thanks
> 
> It looks like some systems have quite a few devices that report -1 as 
> numa_node value causing lots of warning messages being printed.
> Quick fixes that come to mind would be:
> 1) Change log level to DEBUG

As it is important for performance, it should not be just for DEBUG.

> 2) Add static var to only print the message once.

Yes, good idea.

> I also think that the message itself should show at least the BDF to at 
> least know which devices are reporting bad numa_node values.

With the static variable, we will have only the first device BDF.
Is it relevant?

* Re: [PATCH v4] eal: Set numa node value for system which not support it.
  2017-06-26  9:39       ` Thomas Monjalon
@ 2017-06-26 12:50         ` Sergio Gonzalez Monroy
  2017-06-26 14:36           ` Thomas Monjalon
  0 siblings, 1 reply; 7+ messages in thread
From: Sergio Gonzalez Monroy @ 2017-06-26 12:50 UTC
  To: Thomas Monjalon; +Cc: Tonghao Zhang, dev

On 26/06/2017 10:39, Thomas Monjalon wrote:
> 26/06/2017 11:14, Sergio Gonzalez Monroy:
>> On 23/06/2017 14:02, Thomas Monjalon wrote:
>>> 22/06/2017 17:15, Sergio Gonzalez Monroy:
>>>> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
>>>> guidelines).
>>>>
>>>> On 11/05/2017 02:56, Tonghao Zhang wrote:
>>>>> The NUMA node information for PCI devices provided through
>>>>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
>>>>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
>>>>> It is good to see more checking for valid values.
>>>>>
>>>>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
>>>>> ---
>>>> IMHO the message could be slightly improved by adding some of the
>>>> replies that you made to your v3.
>>>> ie. Typical wrong numa node in VMs
>>>>
>>>> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
>>>> -1
>>> [...]
>>>> The code changes look fine, so I leave it to Thomas regarding the commit
>>>> message :)
>>>>
>>>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
>>> Applied, thanks
>> It looks like some systems have quite a few devices that report -1 as
>> numa_node value causing lots of warning messages being printed.
>> Quick fixes that come to mind would be:
>> 1) Change log level to DEBUG
> As it is important for performance, it should not be just for DEBUG.
>
>> 2) Add static var to only print the message once.
> Yes good idea.
>
>> I also think that the message itself should show at least the BDF to at
>> least know which devices are reporting bad numa_node values.
> With the static variable, we will have only the first device BDF.
> Is it relevant?
>

I think it is relevant if it affects a device used by DPDK, but we don't 
know that when doing full pci_scan.

At least on x86 platforms we usually see many PCI devices without numa_node:
ls /sys/bus/pci/devices | xargs -n 1 -I {} head -v "/sys/bus/pci/devices/{}/numa_node"

A single warning is not going to mean much if all platforms have PCI 
devices without proper numa_node, right?

A cleaner solution might be to leave -1 if we fail to parse
numa_node, and then in rte_pci_probe_one_driver, after checking whether
the device is blacklisted, check if socket_id is -1 and show a warning
message, defaulting to 0?

I would be inclined to:
a) leave it as it is with DEBUG log level, also showing PCI BDF (very 
noisy in debug mode).
b) show the warning and default to 0 in rte_pci_probe_one_driver, 
showing only relevant devices.

Sergio
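
To make option b) concrete, a minimal sketch under stated assumptions: struct
pci_dev_sketch and both functions are hypothetical stand-ins rather than the
real rte_pci_device or rte_pci_probe_one_driver, and fprintf() replaces
RTE_LOG.

```c
#include <stdio.h>

/* Hypothetical stand-in for the fields of a PCI device that matter here. */
struct pci_dev_sketch {
	const char *bdf;	/* e.g. "0000:00:18.6" */
	int numa_node;		/* -1 when sysfs gave no usable value */
};

/* Scan step: record what sysfs reported, keeping -1 for bad values. */
static void
scan_set_numa_node(struct pci_dev_sketch *dev, long parsed, int parse_ok,
		   int max_nodes)
{
	if (parse_ok && parsed >= 0 && parsed < max_nodes)
		dev->numa_node = (int)parsed;
	else
		dev->numa_node = -1;
}

/*
 * Probe step: meant to run only for devices that are about to be used,
 * so the warning is limited to relevant devices.
 */
static void
probe_fixup_numa_node(struct pci_dev_sketch *dev)
{
	if (dev->numa_node < 0) {
		fprintf(stderr,
			"WARNING: device %s: invalid NUMA node, "
			"defaulting to 0\n", dev->bdf);
		dev->numa_node = 0;
	}
}
```

Devices that are scanned but never probed keep -1 and produce no output in
this sketch.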

* Re: [PATCH v4] eal: Set numa node value for system which not support it.
  2017-06-26 12:50         ` Sergio Gonzalez Monroy
@ 2017-06-26 14:36           ` Thomas Monjalon
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2017-06-26 14:36 UTC
  To: Sergio Gonzalez Monroy; +Cc: Tonghao Zhang, dev

26/06/2017 14:50, Sergio Gonzalez Monroy:
> On 26/06/2017 10:39, Thomas Monjalon wrote:
> > 26/06/2017 11:14, Sergio Gonzalez Monroy:
> >> On 23/06/2017 14:02, Thomas Monjalon wrote:
> >>> 22/06/2017 17:15, Sergio Gonzalez Monroy:
> >>>> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
> >>>> guidelines).
> >>>>
> >>>> On 11/05/2017 02:56, Tonghao Zhang wrote:
> >>>>> The NUMA node information for PCI devices provided through
> >>>>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> >>>>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> >>>>> It is good to see more checking for valid values.
> >>>>>
> >>>>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> >>>>> ---
> >>>> IMHO the message could be slightly improved by adding some of the
> >>>> replies that you made to your v3.
> >>>> ie. Typical wrong numa node in VMs
> >>>>
> >>>> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
> >>>> -1
> >>> [...]
> >>>> The code changes look fine, so I leave it to Thomas regarding the commit
> >>>> message :)
> >>>>
> >>>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> >>> Applied, thanks
> >> It looks like some systems have quite a few devices that report -1 as
> >> numa_node value causing lots of warning messages being printed.
> >> Quick fixes that come to mind would be:
> >> 1) Change log level to DEBUG
> > As it is important for performance, it should not be just for DEBUG.
> >
> >> 2) Add static var to only print the message once.
> > Yes good idea.
> >
> >> I also think that the message itself should show at least the BDF to at
> >> least know which devices are reporting bad numa_node values.
> > With the static variable, we will have only the first device BDF.
> > Is it relevant?
> >
> 
> I think it is relevant if it affects a device used by DPDK, but we don't 
> know that when doing full pci_scan.
> 
> At least on x86 platforms we usually see many PCI devices without numa_node:
> ls /sys/bus/pci/devices | xargs -n 1 -I {} head -v 
> "/sys/bus/pci/devices/{}/numa_node"
> 
> A single warning is not going to mean much if all platforms have PCI 
> devices without proper numa_node, right?
> 
> A more cleaner solution might be to leave -1 if we failed to parse 
> numa_node, then on rte_pci_probe_one_driver after checking if it is 
> blacklisted check if socket_id is -1 and show warning message defaulting 
> to 0?
> 
> I would be inclined to:
> a) leave it as it is with DEBUG log level, also showing PCI BDF (very 
> noisy in debug mode).
> b) show the warning and default to 0 in rte_pci_probe_one_driver, 
> showing only relevant devices.

Looks like a good proposal, Sergio!

Thanks
