linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
@ 2020-11-13  5:06 Alexey Kardashevskiy
  2020-11-13  5:30 ` Andrew Donnellan
  0 siblings, 1 reply; 4+ messages in thread
From: Alexey Kardashevskiy @ 2020-11-13  5:06 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Alexey Kardashevskiy, Alex Williamson, kvm, David Gibson

We execute certain NPU2 setup code (such as mapping an LPID to a device
in NPU2) unconditionally if an Nvlink bridge is detected. However this
cannot succeed on P8+ machines as the init helpers return an error other
than ENODEV which means the device is there and setup failed, so
vfio_pci_enable() fails and passthrough is not possible.

This changes the two NPU2 related init helpers to return -ENODEV if
there is no "memory-region" device tree property as this is
the distinction between NPU and NPU2.

Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 drivers/vfio/pci/vfio_pci_nvlink2.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
index 65c61710c0e9..9adcf6a8f888 100644
--- a/drivers/vfio/pci/vfio_pci_nvlink2.c
+++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
@@ -231,7 +231,7 @@ int vfio_pci_nvdia_v100_nvlink2_init(struct vfio_pci_device *vdev)
 		return -EINVAL;
 
 	if (of_property_read_u32(npu_node, "memory-region", &mem_phandle))
-		return -EINVAL;
+		return -ENODEV;
 
 	mem_node = of_find_node_by_phandle(mem_phandle);
 	if (!mem_node)
@@ -393,7 +393,7 @@ int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
 	int ret;
 	struct vfio_pci_npu2_data *data;
 	struct device_node *nvlink_dn;
-	u32 nvlink_index = 0;
+	u32 nvlink_index = 0, mem_phandle = 0;
 	struct pci_dev *npdev = vdev->pdev;
 	struct device_node *npu_node = pci_device_to_OF_node(npdev);
 	struct pci_controller *hose = pci_bus_to_host(npdev->bus);
@@ -408,6 +408,9 @@ int vfio_pci_ibm_npu2_init(struct vfio_pci_device *vdev)
 	if (!pnv_pci_get_gpu_dev(vdev->pdev))
 		return -ENODEV;
 
+	if (of_property_read_u32(npu_node, "memory-region", &mem_phandle))
+		return -ENODEV;
+
 	/*
 	 * NPU2 normally has 8 ATSD registers (for concurrency) and 6 links
 	 * so we can allocate one register per link, using nvlink index as
-- 
2.17.1
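
The return-code convention described in the commit message above can be
illustrated with a small user-space sketch. This is not the kernel code:
struct sketch_node, sketch_npu2_init() and sketch_enable() are hypothetical
names invented for illustration, and the two boolean fields only model
whether the "memory-region" property exists and whether setup would succeed.
It assumes, as the commit message says, that the caller treats -ENODEV as
"nothing to set up" and any other error as a real failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device tree node; not a kernel type. */
struct sketch_node {
	bool has_memory_region;	/* models presence of "memory-region" */
	bool setup_ok;		/* models whether NPU2 setup would succeed */
};

/* Models an NPU2 init helper with the patch applied. */
static int sketch_npu2_init(const struct sketch_node *node)
{
	if (!node)
		return -ENODEV;

	/* No "memory-region": old NPU, so report "no such device". */
	if (!node->has_memory_region)
		return -ENODEV;

	/* NPU2 is present; a failure from here on is a real error. */
	if (!node->setup_ok)
		return -EINVAL;

	return 0;
}

/* Models a caller that tolerates -ENODEV but fails on anything else. */
static int sketch_enable(const struct sketch_node *node)
{
	int ret = sketch_npu2_init(node);

	if (ret && ret != -ENODEV)
		return ret;

	return 0;
}
```

With a node that lacks the property, sketch_enable() succeeds because the
helper reports -ENODEV; had the helper returned -EINVAL for that case (the
behaviour before the patch), the same caller would have failed.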



* Re: [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
  2020-11-13  5:06 [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU Alexey Kardashevskiy
@ 2020-11-13  5:30 ` Andrew Donnellan
  2020-11-14  4:16   ` Alexey Kardashevskiy
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Donnellan @ 2020-11-13  5:30 UTC (permalink / raw)
  To: Alexey Kardashevskiy, linuxppc-dev; +Cc: Alex Williamson, kvm, David Gibson

On 13/11/20 4:06 pm, Alexey Kardashevskiy wrote:
> We execute certain NPU2 setup code (such as mapping an LPID to a device
> in NPU2) unconditionally if an Nvlink bridge is detected. However this
> cannot succeed on P8+ machines as the init helpers return an error other
> than ENODEV which means the device is there and setup failed, so
> vfio_pci_enable() fails and passthrough is not possible.
> 
> This changes the two NPU2 related init helpers to return -ENODEV if
> there is no "memory-region" device tree property as this is
> the distinction between NPU and NPU2.
> 
> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Should this be Cc: stable?


Andrew

-- 
Andrew Donnellan              OzLabs, ADL Canberra
ajd@linux.ibm.com             IBM Australia Limited


* Re: [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
  2020-11-13  5:30 ` Andrew Donnellan
@ 2020-11-14  4:16   ` Alexey Kardashevskiy
  2020-11-16  6:20     ` Michael Ellerman
  0 siblings, 1 reply; 4+ messages in thread
From: Alexey Kardashevskiy @ 2020-11-14  4:16 UTC (permalink / raw)
  To: Andrew Donnellan, linuxppc-dev
  Cc: Leonardo Augusto Guimaraes Garcia, Alex Williamson, kvm, David Gibson



On 13/11/2020 16:30, Andrew Donnellan wrote:
> On 13/11/20 4:06 pm, Alexey Kardashevskiy wrote:
>> We execute certain NPU2 setup code (such as mapping an LPID to a device
>> in NPU2) unconditionally if an Nvlink bridge is detected. However this
>> cannot succeed on P8+ machines as the init helpers return an error other
>> than ENODEV which means the device is there and setup failed, so
>> vfio_pci_enable() fails and passthrough is not possible.
>>
>> This changes the two NPU2 related init helpers to return -ENODEV if
>> there is no "memory-region" device tree property as this is
>> the distinction between NPU and NPU2.
>>
>> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] 
>> subdriver")
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> 
> Should this be Cc: stable?

This depends on whether P8+ + NVLink was ever a product (hi Leonardo)
and had actual customers who still rely on upstream kernels to work:
after many years, it was only last week that I heard from a Red Hat test
engineer that it does not work. Maybe cc: stable...


-- 
Alexey


* Re: [PATCH kernel] vfio_pci_nvlink2: Do not attempt NPU2 setup on old P8's NPU
  2020-11-14  4:16   ` Alexey Kardashevskiy
@ 2020-11-16  6:20     ` Michael Ellerman
  0 siblings, 0 replies; 4+ messages in thread
From: Michael Ellerman @ 2020-11-16  6:20 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Andrew Donnellan, linuxppc-dev
  Cc: Leonardo Augusto Guimaraes Garcia, Alex Williamson, kvm, David Gibson

Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> On 13/11/2020 16:30, Andrew Donnellan wrote:
>> On 13/11/20 4:06 pm, Alexey Kardashevskiy wrote:
>>> We execute certain NPU2 setup code (such as mapping an LPID to a device
>>> in NPU2) unconditionally if an Nvlink bridge is detected. However this
>>> cannot succeed on P8+ machines as the init helpers return an error other
>>> than ENODEV which means the device is there and setup failed, so
>>> vfio_pci_enable() fails and passthrough is not possible.
>>>
>>> This changes the two NPU2 related init helpers to return -ENODEV if
>>> there is no "memory-region" device tree property as this is
>>> the distinction between NPU and NPU2.
>>>
>>> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] 
>>> subdriver")
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> 
>> Should this be Cc: stable?
>
> This depends on whether P8+ + NVLink was ever a product (hi Leonardo)
> and had actual customers who still rely on upstream kernels to work:
> after many years, it was only last week that I heard from a Red Hat test
> engineer that it does not work. Maybe cc: stable...

I don't think it really matters if it was a product or not. Upstream is
never a product anyway.

If the fix is simple and unlikely to introduce a regression, and would
potentially save someone having to debug the problem again, then it
should get backported to stable.

You should also clarify what you mean by "P8+". It won't be clear to
most readers whether you mean "Power 8 and/or later" or specifically
Naples / Power8 NVL.

cheers
