linux-rdma.vger.kernel.org archive mirror
* SRIOV virtual functions with Linux "inbox" opensm
@ 2024-04-22 10:09 Ewen Chan
  2024-04-30 12:03 ` Leon Romanovsky
  0 siblings, 1 reply; 6+ messages in thread
From: Ewen Chan @ 2024-04-22 10:09 UTC (permalink / raw)
  To: linux-rdma

To Whom It May Concern:

I am using a few Mellanox ConnectX-4 100 Gbps InfiniBand NICs that are connected together via a Mellanox MSB7890 externally managed switch.

I have a dual Xeon E5-2697A v4 server running Proxmox 7.4-17 (Debian 11) that runs opensm, along with two AMD Ryzen 5950X compute nodes that also have ConnectX-4 cards and also run Proxmox 7.4-17.

I have enabled SR-IOV on all three systems, and all three systems have 8 virtual functions for said ConnectX-4.
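
For reference, a minimal sketch of how the number of enabled virtual functions can be read back on each host. It assumes the standard PCI sysfs attributes `sriov_numvfs` and `sriov_totalvfs` are reachable through the RDMA device's `device` link; the device name `mlx5_0` in the comment is an assumption and is not taken from this thread.

```python
from pathlib import Path
from typing import Tuple


def read_vf_counts(device: str, sysfs_root: str = "/sys/class/infiniband") -> Tuple[int, int]:
    """Return (enabled VFs, maximum VFs) for the given RDMA device."""
    # <sysfs_root>/<device>/device points at the underlying PCI function.
    dev_dir = Path(sysfs_root) / device / "device"
    num_vfs = int((dev_dir / "sriov_numvfs").read_text().strip())
    total_vfs = int((dev_dir / "sriov_totalvfs").read_text().strip())
    return num_vfs, total_vfs


# Example call on a host (device name is an assumption):
#   read_vf_counts("mlx5_0")
```

With the setup described above, the first value would be expected to be 8 on all three systems.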

I read in the Nvidia/Mellanox documentation that I need to add the parameter "virt_enabled 2" to /etc/opensm/opensm.conf so that the OpenSM subnet manager knows that virtual functions are enabled. However, the opensm that ships with Debian 11/linux-rdma appears either to ignore that option or not to know what to do with it.
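
A small sketch for confirming that the option is actually present and uncommented in the configuration file. It assumes the usual `name value` line layout with `#` comments. The helper name `find_option` is hypothetical, and the check says nothing about whether a given opensm build understands the option.

```python
from typing import Optional


def find_option(conf_text: str, name: str) -> Optional[str]:
    """Return the value of the last uncommented `name value` line, or None."""
    value = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        if parts[0] == name:
            value = parts[1].strip() if len(parts) > 1 else ""
    return value


# Example use on a host:
#   with open("/etc/opensm/opensm.conf") as f:
#       print(find_option(f.read(), "virt_enabled"))
```

A result of `None` would mean the line is missing or commented out. A result of "2" would only show that the file contains the setting.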

I would prefer NOT to install the MLNX_OFED drivers for Debian (11) if I can avoid it.

My two questions are: how do I get the Linux opensm to

    Recognise that I am using virtual functions (so that it understands that there are multiple traffic streams coming over the wire via one physical port)?

    Automatically assign the Node GUID and Port GUID, so that I don't have to set those manually?

    (I have already set the Node GUID and Port GUID on my Ryzen compute node host, and I can see them inside my CentOS 7.7.1908 VM, which I have updated to use the 5.4.247 kernel, but the VM is still showing "Port 1, State: Down".)
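
As an illustration of the manual step described above, here is a sketch that builds (but does not run) the `ip link` commands for setting a VF's Node GUID and Port GUID on the host. The interface name, the base GUID, and the rule of deriving GUIDs as base + 2*vf and base + 2*vf + 1 are all hypothetical choices for the example; this is not how any subnet manager assigns GUIDs. The `state auto` command is included on the assumption that the VF link-state policy may also matter for a port that reports Down, which this thread does not establish.

```python
from typing import List


def format_guid(value: int) -> str:
    """Format a 64-bit integer as a colon-separated EUI-64 string."""
    if not 0 <= value < 2**64:
        raise ValueError("GUID must fit in 64 bits")
    return ":".join(f"{b:02x}" for b in value.to_bytes(8, "big"))


def vf_guid_commands(pf_netdev: str, vf: int, base_guid: int) -> List[List[str]]:
    """Build ip-link argv lists for one VF; nothing is executed here."""
    node = format_guid(base_guid + 2 * vf)      # hypothetical derivation rule
    port = format_guid(base_guid + 2 * vf + 1)  # hypothetical derivation rule
    prefix = ["ip", "link", "set", "dev", pf_netdev, "vf", str(vf)]
    return [
        prefix + ["node_guid", node],
        prefix + ["port_guid", port],
        prefix + ["state", "auto"],  # assumption: let the VF follow the physical port
    ]
```

Each returned list could be passed to `subprocess.run(..., check=True)` as root. Per the reply further down in this thread, the subnet manager still has to support virtualization for the VF port to come up.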


Your help is greatly appreciated.

Thank you.


* Re: SRIOV virtual functions with Linux "inbox" opensm
  2024-04-22 10:09 SRIOV virtual functions with Linux "inbox" opensm Ewen Chan
@ 2024-04-30 12:03 ` Leon Romanovsky
  2024-04-30 17:10   ` Ewen Chan
  2024-05-01  4:43   ` Ewen Chan
  0 siblings, 2 replies; 6+ messages in thread
From: Leon Romanovsky @ 2024-04-30 12:03 UTC (permalink / raw)
  To: Ewen Chan; +Cc: linux-rdma

The Linux opensm does not support SR-IOV on ConnectX-4 and later adapters.

You can install the SM RPM from the NVIDIA website to enable ConnectX-4 SR-IOV:
https://network.nvidia.com/products/adapter-software/infiniband-management-and-monitoring-tools/

Thanks


* Re: SRIOV virtual functions with Linux "inbox" opensm
  2024-04-30 12:03 ` Leon Romanovsky
@ 2024-04-30 17:10   ` Ewen Chan
  2024-05-01  4:43   ` Ewen Chan
  1 sibling, 0 replies; 6+ messages in thread
From: Ewen Chan @ 2024-04-30 17:10 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma

Leon:

Your help is greatly appreciated.

Thank you.





* Re: SRIOV virtual functions with Linux "inbox" opensm
  2024-04-30 12:03 ` Leon Romanovsky
  2024-04-30 17:10   ` Ewen Chan
@ 2024-05-01  4:43   ` Ewen Chan
  2024-05-02 16:54     ` Leon Romanovsky
  1 sibling, 1 reply; 6+ messages in thread
From: Ewen Chan @ 2024-05-01  4:43 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma

Leon:

I see in the link that you provided that the IB management packages are only available for Red Hat.

Would it be safe to assume that there is no Debian equivalent to these packages?

Your help is greatly appreciated.

Thank you.





* Re: SRIOV virtual functions with Linux "inbox" opensm
  2024-05-01  4:43   ` Ewen Chan
@ 2024-05-02 16:54     ` Leon Romanovsky
  2024-05-02 19:18       ` Ewen Chan
  0 siblings, 1 reply; 6+ messages in thread
From: Leon Romanovsky @ 2024-05-02 16:54 UTC (permalink / raw)
  To: Ewen Chan; +Cc: linux-rdma

On Wed, May 01, 2024 at 04:43:39AM +0000, Ewen Chan wrote:
> Leon:
> 
> I see in the link that you provided that the IB management packages are only available for Redhat.
> 
> Would it be safe to assume that there are no Debian equivalent to these packages?

Right, at the moment, there is no Debian equivalent to these packages.

Thanks


* Re: SRIOV virtual functions with Linux "inbox" opensm
  2024-05-02 16:54     ` Leon Romanovsky
@ 2024-05-02 19:18       ` Ewen Chan
  0 siblings, 0 replies; 6+ messages in thread
From: Ewen Chan @ 2024-05-02 19:18 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma

Leon:

Thank you.

Your help is greatly appreciated.


