* [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
From: Alexey Kardashevskiy @ 2020-11-13 5:06 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Alexey Kardashevskiy, Alex Williamson, kvm, David Gibson
We execute certain NPU2 setup code (such as mapping an LPID to a device
in NPU2) unconditionally if an NVLink bridge is detected. However, this
cannot succeed on P8+ machines: the init helpers return an error other
than -ENODEV, which means the device is there and setup failed, so
vfio_pci_enable() fails and passthrough is not possible.
This changes the two NPU2 related init helpers to return -ENODEV if
there is no "memory-region" device tree property as this is
the distinction between NPU and NPU2.
Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
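For illustration only, here is a minimal userspace sketch of the
error-code convention the commit message relies on: the caller treats
-ENODEV as "this is not an NPU2, carry on" and any other error as a
failed setup. All names in it (fake_npu_node, fake_init_before,
fake_init_after, fake_enable) are hypothetical stand-ins and not the
kernel code; the has_memory_region flag models the presence of the
"memory-region" device tree property.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an NPU device tree node; not a kernel type. */
struct fake_npu_node {
	bool has_memory_region;	/* true on NPU2, false on the older NPU */
};

/* Old behaviour: a missing "memory-region" is reported as a setup failure. */
static int fake_init_before(const struct fake_npu_node *node)
{
	assert(node != NULL);
	return node->has_memory_region ? 0 : -EINVAL;
}

/* New behaviour: a missing "memory-region" means "not an NPU2, skip". */
static int fake_init_after(const struct fake_npu_node *node)
{
	assert(node != NULL);
	return node->has_memory_region ? 0 : -ENODEV;
}

/*
 * Simplified model of the caller: -ENODEV is tolerated, any other
 * error aborts the enable path.
 */
static int fake_enable(int (*init)(const struct fake_npu_node *),
		       const struct fake_npu_node *node)
{
	int ret = init(node);

	if (ret && ret != -ENODEV)
		return ret;	/* enable fails, no passthrough */
	return 0;		/* device usable, NPU2 setup done or skipped */
}
```

In this model, a node without the property makes
fake_enable(fake_init_before, ...) return -EINVAL, while
fake_enable(fake_init_after, ...) returns 0, which is the behaviour
change the patch is after.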
drivers/vfio/pci/vfio_pci_nvlink2.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
index 65c61710c0e9..9adcf6a8f888 100644
--- a/drivers/vfio/pci/vfio_pci_nvlink2.c
+++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
@@ -231,7 +231,7 @@ int vfio_pci_nvdia_v100_nvlink2_init(struct vfio_pci_device *vdev)
 		return -EINVAL;
 
 	if (of_property_read_u32(npu_node, "memory-region", &mem_phandle))
-		return -EINVAL;
+		return -ENODEV;
 
 	mem_node = of_find_node_by_phandle(mem_phandle);
 	if (!mem_node)
@@ -393,7 +393,7 @@ int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
 	int ret;
 	struct vfio_pci_npu2_data *data;
 	struct device_node *nvlink_dn;
-	u32 nvlink_index = 0;
+	u32 nvlink_index = 0, mem_phandle = 0;
 	struct pci_dev *npdev = vdev->pdev;
 	struct device_node *npu_node = pci_device_to_OF_node(npdev);
 	struct pci_controller *hose = pci_bus_to_host(npdev->bus);
@@ -408,6 +408,9 @@ int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
 	if (!pnv_pci_get_gpu_dev(vdev->pdev))
 		return -ENODEV;
 
+	if (of_property_read_u32(npu_node, "memory-region", &mem_phandle))
+		return -ENODEV;
+
 	/*
 	 * NPU2 normally has 8 ATSD registers (for concurrency) and 6 links
 	 * so we can allocate one register per link, using nvlink index as
--
2.17.1
* Re: [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
From: Andrew Donnellan @ 2020-11-13 5:30 UTC (permalink / raw)
To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Alex Williamson, kvm, David Gibson
On 13/11/20 4:06 pm, Alexey Kardashevskiy wrote:
> We execute certain NPU2 setup code (such as mapping an LPID to a device
> in NPU2) unconditionally if an Nvlink bridge is detected. However this
> cannot succeed on P8+ machines as the init helpers return an error other
> than ENODEV which means the device is there and setup failed so
> vfio_pci_enable() fails and pass through is not possible.
>
> This changes the two NPU2 related init helpers to return -ENODEV if
> there is no "memory-region" device tree property as this is
> the distinction between NPU and NPU2.
>
> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Should this be Cc: stable?
Andrew
--
Andrew Donnellan OzLabs, ADL Canberra
ajd@linux.ibm.com IBM Australia Limited
* Re: [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
From: Alexey Kardashevskiy @ 2020-11-14 4:16 UTC (permalink / raw)
To: Andrew Donnellan, linuxppc-dev
Cc: Leonardo Augusto Guimaraes Garcia, Alex Williamson, kvm, David Gibson
On 13/11/2020 16:30, Andrew Donnellan wrote:
> On 13/11/20 4:06 pm, Alexey Kardashevskiy wrote:
>> We execute certain NPU2 setup code (such as mapping an LPID to a device
>> in NPU2) unconditionally if an Nvlink bridge is detected. However this
>> cannot succeed on P8+ machines as the init helpers return an error other
>> than ENODEV which means the device is there and setup failed so
>> vfio_pci_enable() fails and pass through is not possible.
>>
>> This changes the two NPU2 related init helpers to return -ENODEV if
>> there is no "memory-region" device tree property as this is
>> the distinction between NPU and NPU2.
>>
>> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2]
>> subdriver")
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>
> Should this be Cc: stable?
This depends on whether P8+ + NVLink was ever a product (hi Leonardo)
and had actual customers who still rely on upstream kernels to work, as
after many years it was only last week that I heard from some Red Hat
test engineer that it does not work. Maybe cc: stable...
--
Alexey
* Re: [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
From: Michael Ellerman @ 2020-11-16 6:20 UTC (permalink / raw)
To: Alexey Kardashevskiy, Andrew Donnellan, linuxppc-dev
Cc: Leonardo Augusto Guimaraes Garcia, Alex Williamson, kvm, David Gibson
Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> On 13/11/2020 16:30, Andrew Donnellan wrote:
>> On 13/11/20 4:06 pm, Alexey Kardashevskiy wrote:
>>> We execute certain NPU2 setup code (such as mapping an LPID to a device
>>> in NPU2) unconditionally if an Nvlink bridge is detected. However this
>>> cannot succeed on P8+ machines as the init helpers return an error other
>>> than ENODEV which means the device is there and setup failed so
>>> vfio_pci_enable() fails and pass through is not possible.
>>>
>>> This changes the two NPU2 related init helpers to return -ENODEV if
>>> there is no "memory-region" device tree property as this is
>>> the distinction between NPU and NPU2.
>>>
>>> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2]
>>> subdriver")
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>
>> Should this be Cc: stable?
>
> This depends on whether P8+ + NVLink was ever a product (hi Leonardo)
> and had actual customers who still rely on upstream kernels to work, as
> after many years it was only last week that I heard from some Red Hat
> test engineer that it does not work. Maybe cc: stable...
I don't think it really matters if it was a product or not. Upstream is
never a product anyway.
If the fix is simple and unlikely to introduce a regression, and would
potentially save someone having to debug the problem again, then it
should get backported to stable.
You should also clarify what you mean by "P8+"; it won't be clear to
most readers whether you mean "Power 8 and/or later" or specifically
Naples / Power8 NVL.
cheers