Hi,

We are debugging an issue where skx_pci_uncores cannot be registered on an
8-socket system with Xeon Platinum 8176 CPUs. After poking around for a
while, I found it is caused by snbep_pci2phy_map_init() failing to find
a ubox_dev:

	ubox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, devid, ubox_dev);
	ubox_dev == NULL
	...

The same kernel (Linus' master) works fine on some single-socket SKX
systems.

I am not sure what to check next, and I am not sure whether this is
specific to this system (HPE Superdome Flex).

One thing I noticed is that the PCI configuration space shows a
subsystem vendor ID of 0x1590 instead of 0x8086:

0000:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 90 15 14 20  << subsystem vendor
30: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00

But I don't think that is the problem, as the code searches with PCI_ANY_ID.

Any suggestions on what might be broken? And what to try next?

Thanks in advance!
Song

On 1/25/2019 1:54 PM, Song Liu wrote:
> Hi,
>
> We are debugging an issue where skx_pci_uncores cannot be registered on an
> 8-socket system with Xeon Platinum 8176 CPUs. After poking around for a
> while, I found it is caused by snbep_pci2phy_map_init() failing to find
> a ubox_dev:
>
> 	ubox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, devid, ubox_dev);
> 	ubox_dev == NULL
> 	...
>
> The same kernel (Linus' master) works fine on some single-socket SKX
> systems.
>
> I am not sure what to check next, and I am not sure whether this is
> specific to this system (HPE Superdome Flex).

Could you please share the values at offsets 0xC0 and 0xD4 of the PCI
configuration space for each device whose PCI ID is 0x2014?

snbep_pci2phy_map_init() tries to build a mapping from BUS# to Socket ID.
CPUNODEID (0xc0) discloses the Node ID of the current BUS.
GIDNIDMAP (0xd4) discloses the mapping between Socket ID and Node ID.

Here is an example from a 4-socket SKX.

BUS     CPUNODEID(bit2:0)   GIDNIDMAP
0x0     0x0                 0x688
0x40    0x1                 0x688
0x80    0x2                 0x688
0xC0    0x3                 0x688

> One thing I noticed is that the PCI configuration space shows a
> subsystem vendor ID of 0x1590 instead of 0x8086:
>
> 0000:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
> 00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
> 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> 20: 00 00 00 00 00 00 00 00 00 00 00 00 90 15 14 20  << subsystem vendor
> 30: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00
>
> But I don't think that is the problem, as the code searches with PCI_ANY_ID.

It looks for the device with PCI ID 0x2014.

Thanks,
Kan

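A minimal sketch of how the GIDNIDMAP value in the 4-socket example can be read, assuming the register packs eight 3-bit Node ID fields with field i belonging to Socket i. That layout is an assumption consistent with the 0x688 example, and `gidnidmap_field()` and `gidnidmap_dump()` are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: return the i-th 3-bit field (i = 0..7) of a
 * GIDNIDMAP value. Assumes field i occupies bits [3*i+2 : 3*i].
 */
static unsigned int gidnidmap_field(uint32_t gidnidmap, int i)
{
	return (gidnidmap >> (3 * i)) & 0x7;
}

/* Print the assumed Socket ID -> Node ID table for one GIDNIDMAP value. */
static void gidnidmap_dump(uint32_t gidnidmap)
{
	for (int i = 0; i < 8; i++)
		printf("socket %d -> node %u\n", i,
		       gidnidmap_field(gidnidmap, i));
}
```

Calling `gidnidmap_dump(0x688)` lists nodes 0, 1, 2, 3 for sockets 0 to 3 and node 0 for the remaining four fields, which lines up with the CPUNODEID column of the example table.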
Thanks Kan!

> On Jan 25, 2019, at 12:08 PM, Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> Could you please share the values at offsets 0xC0 and 0xD4 of the PCI
> configuration space for each device whose PCI ID is 0x2014?
>
> snbep_pci2phy_map_init() tries to build a mapping from BUS# to Socket ID.
> CPUNODEID (0xc0) discloses the Node ID of the current BUS.
> GIDNIDMAP (0xd4) discloses the mapping between Socket ID and Node ID.
>
> Here is an example from a 4-socket SKX.
>
> BUS     CPUNODEID(bit2:0)   GIDNIDMAP
> 0x0     0x0                 0x688
> 0x40    0x1                 0x688
> 0x80    0x2                 0x688
> 0xC0    0x3                 0x688

Here is the data I get:

# lspci -xxx | grep "86 80 14 20" -A 15 -B 1 | grep -e "86 80 14 20" -e c0: -e d0: -e Intel
0000:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: 00 a0 00 00 2f 00 00 80 01 00 02 00 2f 2f 2f 20
d0: 02 00 00 00 88 d6 b6 00 01 00 00 00 00 00 00 00

0001:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: 01 80 00 00 1f 00 00 80 01 00 02 00 1f 1f 1f 10
d0: 02 00 00 00 88 46 92 00 01 00 00 00 00 00 00 00

0002:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: 02 e0 00 00 8f 00 00 80 01 00 02 00 8f 8f 8f 80
d0: 02 00 00 00 88 f6 ff 00 01 00 00 00 00 00 00 00

0003:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: 03 c0 00 00 4f 00 00 80 01 00 02 00 4f 4f 4f 40
d0: 02 00 00 00 88 66 db 00 01 00 00 00 00 00 00 00

0004:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: a0 b4 00 00 2f 00 00 80 01 00 02 00 2f 2f 2f 20
d0: 02 00 00 00 6d 8b 68 00 01 00 00 00 00 00 00 00

0005:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: 81 90 00 00 1f 00 00 80 01 00 02 00 1f 1f 1f 10
d0: 02 00 00 00 24 89 68 00 01 00 00 00 00 00 00 00

0006:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: e2 fc 00 00 8f 00 00 80 01 00 02 00 8f 8f 8f 80
d0: 02 00 00 00 ff 8f 68 00 01 00 00 00 00 00 00 00

0007:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
c0: c3 d8 00 00 4f 00 00 80 01 00 02 00 4f 4f 4f 40
d0: 02 00 00 00 b6 8d 68 00 01 00 00 00 00 00 00 00

Song

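For readers decoding the dump by hand: lspci prints configuration space byte by byte, so the dword at 0xc0 has to be assembled little-endian before any bit field is read. The sketch below does that for the eight CPUNODEID values reported in this thread. The helper names `le32()` and `local_node_id()` are hypothetical, and the 3-bit width of the node ID is taken from the "CPUNODEID(bit2:0)" column header earlier in the thread.

```c
#include <stdint.h>

/*
 * First four bytes at offset 0xc0 for PCI segments 0000..0007, copied
 * from the lspci output reported in this thread.
 */
static const uint8_t cpunodeid_bytes[8][4] = {
	{0x00, 0xa0, 0x00, 0x00},
	{0x01, 0x80, 0x00, 0x00},
	{0x02, 0xe0, 0x00, 0x00},
	{0x03, 0xc0, 0x00, 0x00},
	{0xa0, 0xb4, 0x00, 0x00},
	{0x81, 0x90, 0x00, 0x00},
	{0xe2, 0xfc, 0x00, 0x00},
	{0xc3, 0xd8, 0x00, 0x00},
};

/* Assemble a little-endian dword from four consecutive bytes. */
static uint32_t le32(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: low three bits of CPUNODEID for one segment. */
static uint32_t local_node_id(int seg)
{
	return le32(cpunodeid_bytes[seg]) & 0x7;
}
```

On this data the full dwords are 0xa000, 0x8001, 0xe002, 0xc003, 0xb4a0, 0x9081, 0xfce2 and 0xd8c3, while the low three bits are 0, 1, 2, 3, 0, 1, 2, 3. Every device, including segment 0000, has bits set above bit 2.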
On 1/25/2019 3:16 PM, Song Liu wrote:
> Here is the data I get:
>
> # lspci -xxx | grep "86 80 14 20" -A 15 -B 1 | grep -e "86 80 14 20" -e c0: -e d0: -e Intel
> 0000:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
> 00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
> c0: 00 a0 00 00 2f 00 00 80 01 00 02 00 2f 2f 2f 20
> d0: 02 00 00 00 88 d6 b6 00 01 00 00 00 00 00 00 00
>
> 0001:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
> 00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
> c0: 01 80 00 00 1f 00 00 80 01 00 02 00 1f 1f 1f 10
> d0: 02 00 00 00 88 46 92 00 01 00 00 00 00 00 00 00
>
> 0002:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
> 00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
> c0: 02 e0 00 00 8f 00 00 80 01 00 02 00 8f 8f 8f 80
> d0: 02 00 00 00 88 f6 ff 00 01 00 00 00 00 00 00 00
>
> 0003:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
> 00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
> c0: 03 c0 00 00 4f 00 00 80 01 00 02 00 4f 4f 4f 40
> d0: 02 00 00 00 88 66 db 00 01 00 00 00 00 00 00 00
>
> 0004:00:08.0 System peripheral: Intel Corporation Sky Lake-E Ubox Registers (rev 04)
> 00: 86 80 14 20 00 00 10 00 04 00 80 08 00 00 80 00
> c0: a0 b4 00 00 2f 00 00 80 01 00 02 00 2f 2f 2f 20

The local node ID should be bit2:0. We didn't mask it in our code.
Does the patch below work?

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index c07bee3..15a8e3c 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -1222,6 +1222,8 @@ static struct pci_driver snbep_uncore_pci_driver = {
 	.id_table	= snbep_uncore_pci_ids,
 };
 
+#define NODE_ID_MASK	0x7
+
 /*
  * build pci bus to socket mapping
  */
@@ -1243,7 +1245,7 @@ static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool
 		err = pci_read_config_dword(ubox_dev, nodeid_loc, &config);
 		if (err)
 			break;
-		nodeid = config;
+		nodeid = config & NODE_ID_MASK;
 		/* get the Node ID mapping */
 		err = pci_read_config_dword(ubox_dev, idmap_loc, &config);
 		if (err)

Thanks,
Kan

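To illustrate why the mask matters on this machine, here is a user-space sketch that replays the reported CPUNODEID/GIDNIDMAP pairs through a lookup in the spirit of the one described for snbep_pci2phy_map_init(). It assumes GIDNIDMAP holds eight 3-bit fields indexed by Socket ID; `nodeid_to_socket()` and `socket_for_segment()` are hypothetical names, and the sketch is not the kernel implementation.

```c
#include <stdint.h>

#define NODE_ID_MASK	0x7

/* {CPUNODEID, GIDNIDMAP} dwords for segments 0000..0007, assembled
 * little-endian from the lspci bytes reported in this thread. */
static const uint32_t ubox_regs[8][2] = {
	{0x0000a000, 0x00b6d688},
	{0x00008001, 0x00924688},
	{0x0000e002, 0x00fff688},
	{0x0000c003, 0x00db6688},
	{0x0000b4a0, 0x00688b6d},
	{0x00009081, 0x00688924},
	{0x0000fce2, 0x00688fff},
	{0x0000d8c3, 0x00688db6},
};

/* Hypothetical lookup: index of the first 3-bit field equal to nodeid,
 * or -1 when no field matches. */
static int nodeid_to_socket(uint32_t nodeid, uint32_t gidnidmap)
{
	for (int i = 0; i < 8; i++) {
		if (nodeid == ((gidnidmap >> (3 * i)) & 0x7))
			return i;
	}
	return -1;
}

/* Resolve the socket for one segment, with or without the mask. */
static int socket_for_segment(int seg, int use_mask)
{
	uint32_t nodeid = ubox_regs[seg][0];

	if (use_mask)
		nodeid &= NODE_ID_MASK;
	return nodeid_to_socket(nodeid, ubox_regs[seg][1]);
}
```

Without the mask, none of the eight raw CPUNODEID values equals any 3-bit field, so the lookup in this sketch fails for every segment. With the mask, segments 0000 to 0007 resolve to sockets 0 to 7 respectively.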
> On Jan 25, 2019, at 1:35 PM, Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> The local node ID should be bit2:0. We didn't mask it in our code.
> Does the patch below work?

Yes, this patch works! Now I can see uncore_imc_0 etc. Thanks Kan!

I also noticed that uncore_pci_probe() returns at

	if (atomic_inc_return(&pmu->activeboxes) > 1)
		return 0;

for bus 1-7. So we only probed bus 0 (socket 0). Is this expected behavior?

Thanks again!
Song

On 1/25/2019 5:24 PM, Song Liu wrote:
> I also noticed that uncore_pci_probe() returns at
>
> if (atomic_inc_return(&pmu->activeboxes) > 1)
> return 0;
>
> for bus 1-7. So we only probed bus 0 (socket 0). Is this expected behavior?
We probed all of them, but only need to register the PMU once.
It can be distinguished by cpumask. E.g. you may want to check
"/sys/devices/uncore_imc_0/cpumask".
Thanks,
Kan
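The "probe every box, register once" pattern can be sketched in user space with C11 atomics. This is only an illustration under assumptions: `struct fake_pmu`, `fake_probe()` and `registrations_after()` are hypothetical names, and `atomic_fetch_add(...) + 1` stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>

struct fake_pmu {
	atomic_int activeboxes;	/* boxes that have been probed so far */
	int registered;		/* how many times registration ran */
};

/* Every probe counts its box; only the first one registers the PMU. */
static int fake_probe(struct fake_pmu *pmu)
{
	/* atomic_fetch_add() returns the old value, so add 1 to get the
	 * new count, as atomic_inc_return() would. */
	if (atomic_fetch_add(&pmu->activeboxes, 1) + 1 > 1)
		return 0;

	pmu->registered++;
	return 0;
}

/* Probe nboxes boxes on a fresh PMU and report the registration count. */
static int registrations_after(int nboxes)
{
	struct fake_pmu pmu;

	atomic_init(&pmu.activeboxes, 0);
	pmu.registered = 0;
	for (int i = 0; i < nboxes; i++)
		fake_probe(&pmu);
	return pmu.registered;
}
```

In this sketch `registrations_after(8)` reports a single registration even though all eight probes ran, which matches the early return Song observed for bus 1-7.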
> On Jan 25, 2019, at 2:42 PM, Liang, Kan <kan.liang@linux.intel.com> wrote:
>
>
>
> On 1/25/2019 5:24 PM, Song Liu wrote:
>> I also noticed that uncore_pci_probe() returns at
>> if (atomic_inc_return(&pmu->activeboxes) > 1)
>> return 0;
>> for bus 1-7. So we only probed bus 0 (socket 0). Is this expected behavior?
>
> We probed all of them, but only need to register the PMU once.
> It can be distinguished by cpumask. E.g. you may want to check "/sys/devices/uncore_imc_0/cpumask".
>
>
> Thanks,
> Kan
I see. That works:
[root@rtptest10787.snc1 /sys/bus/event_source/devices/uncore_imc_0]# cat cpumask
0,28,56,84,112,140,168,196
I think this solves all the problems.
Thanks again!
Song
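A small sketch for checking a cpumask string like the one above, assuming the usual sysfs cpulist format of comma-separated CPU numbers and optional "a-b" ranges. `cpulist_count()` is a hypothetical helper, not part of any kernel or perf API.

```c
#include <stdlib.h>

/*
 * Count the CPUs named in a cpulist string such as "0,28,56" or "0-3,8".
 * Returns -1 if the string is malformed.
 */
static int cpulist_count(const char *s)
{
	int count = 0;

	while (*s && *s != '\n') {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi = lo;

		if (end == s)
			return -1;	/* no digits where a CPU number was expected */
		if (*end == '-') {
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;	/* bad or reversed range */
		}
		count += (int)(hi - lo + 1);

		if (*end == ',')
			s = end + 1;	/* move on to the next entry */
		else if (*end == '\0' || *end == '\n')
			s = end;	/* end of list */
		else
			return -1;	/* unexpected separator */
	}
	return count;
}
```

For the string "0,28,56,84,112,140,168,196" the count is eight, which is consistent with one CPU listed per socket on this 8-socket system.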