* [RFC] change the default Kconfig value of mlx5_en
@ 2017-04-21 19:45 Ian Kumlien
  2017-04-21 19:47 ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Kumlien @ 2017-04-21 19:45 UTC (permalink / raw)
  To: saeedm, Linux Kernel Network Developers

[-- Attachment #1: Type: text/plain, Size: 1620 bytes --]

Hi,

For some reason I spent some hours, two days in a row, trying to debug
why a newer kernel didn't work on our machines. It worked just fine with
the older kernel...

And there were no network interfaces to look at to figure out what
was going on.

Playing with the infiniband tools, all I could see was things like:
...
state: 1: DOWN
phys state: 3: Disabled
cat: /sys/class/infiniband/mlx5_0/ports/1/rate: Invalid argument
rate: unknown
link_layer: Ethernet
...

It turns out that the kernel was compiled with mlx5_en disabled, since
that's the default.

Unless there is a really good reason not to, let's change the default
value to 'y' =)

I'm hoping that this will lead to others not experiencing the same
surreal journey of trying to debug this ;)

----

The mellanox driver supports both ethernet and infiniband, but it
is located in the ethernet drivers - the ethernet support should
default to 'yes'.

Signed-off-by: Ian Kumlien <ian.kumlien@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index ddb4ca4ff930..206894f06dec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -15,7 +15,7 @@ config MLX5_CORE_EN
  bool "Mellanox Technologies ConnectX-4 Ethernet support"
  depends on NETDEVICES && ETHERNET && PCI && MLX5_CORE
  imply PTP_1588_CLOCK
- default n
+ default y
  ---help---
   Ethernet support in Mellanox Technologies ConnectX-4 NIC.

-- 
2.12.2

[-- Attachment #2: 0001-Switch-mlx5_en-default-configiuration-value.patch --]
[-- Type: text/x-patch, Size: 1070 bytes --]

From 19bc8a18fe793177e753589ffd69992434f38348 Mon Sep 17 00:00:00 2001
From: Ian Kumlien <ian.kumlien@gmail.com>
Date: Fri, 21 Apr 2017 21:30:30 +0200
Subject: [PATCH] Switch mlx5_en default configiuration value

The mellanox driver supports both ethernet and infiniband, but it
is located in the ethernet drivers - the ethernet support should
default to 'yes'.

Signed-off-by: Ian Kumlien <ian.kumlien@gmail.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index ddb4ca4ff930..206894f06dec 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -15,7 +15,7 @@ config MLX5_CORE_EN
 	bool "Mellanox Technologies ConnectX-4 Ethernet support"
 	depends on NETDEVICES && ETHERNET && PCI && MLX5_CORE
 	imply PTP_1588_CLOCK
-	default n
+	default y
 	---help---
 	  Ethernet support in Mellanox Technologies ConnectX-4 NIC.
 
-- 
2.12.2



* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-21 19:45 [RFC] change the default Kconfig value of mlx5_en Ian Kumlien
@ 2017-04-21 19:47 ` David Miller
  2017-04-21 19:51   ` Ian Kumlien
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2017-04-21 19:47 UTC (permalink / raw)
  To: ian.kumlien; +Cc: saeedm, netdev

From: Ian Kumlien <ian.kumlien@gmail.com>
Date: Fri, 21 Apr 2017 21:45:00 +0200

> The mellanox driver supports both ethernet and infiniband, but it
> is located in the ethernet drivers - the ethernet support should
> default to 'yes'.

I don't have that card and I therefore perhaps don't want that driver
in my builds.

This is not an appropriate change, sorry.


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-21 19:47 ` David Miller
@ 2017-04-21 19:51   ` Ian Kumlien
  2017-04-21 23:10     ` Ian Kumlien
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Kumlien @ 2017-04-21 19:51 UTC (permalink / raw)
  To: David Miller; +Cc: saeedm, Linux Kernel Network Developers

On Fri, Apr 21, 2017 at 9:47 PM, David Miller <davem@davemloft.net> wrote:
> From: Ian Kumlien <ian.kumlien@gmail.com>
> Date: Fri, 21 Apr 2017 21:45:00 +0200
>
>> The mellanox driver supports both ethernet and infiniband, but it
>> is located in the ethernet drivers - the ethernet support should
>> default to 'yes'.
>
> I don't have that card and I therefore perhaps don't want that driver
> in my builds.

Ah, sorry - really tired here, got it into my head that this was just
a flag for the driver to build the ethernet module, sorry

> This is not an appropriate change, sorry.

Nope, completely agree... sorry for the noise


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-21 19:51   ` Ian Kumlien
@ 2017-04-21 23:10     ` Ian Kumlien
  2017-04-22  0:34       ` Saeed Mahameed
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Kumlien @ 2017-04-21 23:10 UTC (permalink / raw)
  To: David Miller; +Cc: saeedm, Linux Kernel Network Developers

Sorry,

Back again, fighting cold, hot whiskey has been consumed...

Something like this would perhaps be a better solution:

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 60154a175bd3..fe192e247601 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -1139,6 +1139,10 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,

 #ifdef CONFIG_MLX5_CORE_EN
        mlx5_eswitch_attach(dev->priv.eswitch);
+#else
+       if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
+               dev_info(&pdev->dev, "Ethernet device discovered but support not enabled in kernel.");
+       }
 #endif

        err = mlx5_sriov_attach(dev);

This is in no way tested and just a thought for now - I suspect that a
better message would be required though...
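
The compile-time fallback in the diff above can also be sketched as a
standalone function that builds outside the kernel. This is only an
illustration: the enum, its values and the function name are hypothetical
stand-ins (not the mlx5 definitions), and CONFIG_MLX5_CORE_EN is assumed
to be left undefined when compiling, which selects the #else branch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the mlx5 definitions. */
enum port_type {
	PORT_TYPE_IB  = 0,
	PORT_TYPE_ETH = 1,
};

/*
 * Returns the hint to log when an Ethernet port is found but Ethernet
 * support was compiled out, or NULL when there is nothing to report.
 */
static const char *missing_en_hint(enum port_type type)
{
#ifdef CONFIG_MLX5_CORE_EN
	(void)type;	/* Ethernet support is built in, nothing to warn about */
	return NULL;
#else
	if (type == PORT_TYPE_ETH)
		return "Ethernet device discovered but support not enabled in kernel.";
	return NULL;
#endif
}
```

A caller would print the returned string only when it is not NULL, so an
InfiniBand port stays silent while an Ethernet port produces one log line.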


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-21 23:10     ` Ian Kumlien
@ 2017-04-22  0:34       ` Saeed Mahameed
  2017-04-22  0:47         ` Ian Kumlien
  0 siblings, 1 reply; 9+ messages in thread
From: Saeed Mahameed @ 2017-04-22  0:34 UTC (permalink / raw)
  To: Ian Kumlien; +Cc: David Miller, Saeed Mahameed, Linux Kernel Network Developers

On Sat, Apr 22, 2017 at 2:10 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
> Sorry,
>
> Back again, fighting cold, hot whiskey has been consumed...
>
> Something like this would perhaps be a better solution:
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> index 60154a175bd3..fe192e247601 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> @@ -1139,6 +1139,10 @@ static int mlx5_load_one(struct mlx5_core_dev
> *dev, struct mlx5_priv *priv,
>
>  #ifdef CONFIG_MLX5_CORE_EN
>         mlx5_eswitch_attach(dev->priv.eswitch);
> +#else
> +       if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
> +               dev_info(&pdev->dev, "Ethernet device discovered but
> support not enabled in kernel.");
> +       }
>  #endif
>

Currently both MLX5_CORE and MLX5_CORE_EN default to n, so the issue
you are seeing can occur only if you explicitly set MLX5_CORE=y and
MLX5_CORE_EN=n. Why would someone do this if they know they want Ethernet
support as well? IMHO this print is redundant.
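
The combination in question (MLX5_CORE enabled while MLX5_CORE_EN is not)
can be spotted mechanically in a prebuilt kernel's configuration. A
minimal sketch, assuming the config is available as text in the usual
"CONFIG_FOO=y" / "# CONFIG_FOO is not set" format; both helper names are
hypothetical:

```c
#include <stdbool.h>
#include <string.h>

/* True if `line` appears in `config` as one complete line. */
static bool config_has_line(const char *config, const char *line)
{
	size_t len = strlen(line);
	const char *p = config;

	while ((p = strstr(p, line)) != NULL) {
		bool at_start = (p == config) || (p[-1] == '\n');
		bool at_end = (p[len] == '\n') || (p[len] == '\0');

		if (at_start && at_end)
			return true;
		p += 1;	/* partial match, keep scanning */
	}
	return false;
}

/* Core driver enabled (=y or =m) while Ethernet support is not built. */
static bool mlx5_core_without_en(const char *config)
{
	bool core = config_has_line(config, "CONFIG_MLX5_CORE=y") ||
		    config_has_line(config, "CONFIG_MLX5_CORE=m");
	bool en = config_has_line(config, "CONFIG_MLX5_CORE_EN=y");

	return core && !en;
}
```

Matching whole lines keeps "CONFIG_MLX5_CORE=y" from being confused with
"CONFIG_MLX5_CORE_EN=y" or with a commented-out "is not set" entry.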

Anyway, are you looking for RDMA support over ethernet (RoCE), and
not interested in having ethernet netdev support?

If yes, I think this is something that can be achieved, but the
question is: do we really need this?


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-22  0:34       ` Saeed Mahameed
@ 2017-04-22  0:47         ` Ian Kumlien
  2017-04-22  1:07           ` Saeed Mahameed
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Kumlien @ 2017-04-22  0:47 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: David Miller, Saeed Mahameed, Linux Kernel Network Developers

On Sat, Apr 22, 2017 at 2:34 AM, Saeed Mahameed
<saeedm@dev.mellanox.co.il> wrote:
> On Sat, Apr 22, 2017 at 2:10 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>> Sorry,
>>
>> Back again, fighting cold, hot whiskey has been consumed...
>>
>> Something like this would perhaps be a better solution:
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> index 60154a175bd3..fe192e247601 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>> @@ -1139,6 +1139,10 @@ static int mlx5_load_one(struct mlx5_core_dev
>> *dev, struct mlx5_priv *priv,
>>
>>  #ifdef CONFIG_MLX5_CORE_EN
>>         mlx5_eswitch_attach(dev->priv.eswitch);
>> +#else
>> +       if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
>> +               dev_info(&pdev->dev, "Ethernet device discovered but
>> support not enabled in kernel.");
>> +       }
>>  #endif
>>
>
> Currently both MLX5_CORE=n and MLX5_CORE_EN=n as a default, the issue
> you are seeing can occur only if you explicitly  set MLX5_CORE=y and
> MLX5_CORE_EN=n, Why would someone do this if he knows he wants Ethernet
> support as well ? IMHO this print is redundant .

Well, I'm running a prebuilt kernel - which was configured this way,
and since there is no mlx5_en module and it does state that the link is
"Ethernet", it just looks like the driver is broken or in some kind of
really weird state.

> Anyway, Are you looking for RDMA support over ethernet (RoCE) ? and
> you are not interested to have ethernet netdev support ?

? RDMA is something we'll look at in the future, right now, having the
nics actually work as nics is a priority ;)

> if yes, I think this is something that can be achieved, but the
> question is do we really need this ?

It's really weird to see the driver load, to see everything register,
and to get no feedback.

That includes no network devices, but if you run the Infiniband commands,
they tell you that you are connected to Ethernet but that the device is
down and disabled.

To me, down and disabled is not the same as "Ethernet support is
not included" =)

Basically, I would hate for someone else to end up in the same situation,
since you only get guides on how to enable infiniband/RDMA, but what you
really want to do at that point is to disable it and see if that gives you
your network devices back =)

I have had similar issues with some connectx3 devices while playing at
home, but I suspect that it's just a limitation of the OFED packages
available for the dist I'm running.


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-22  0:47         ` Ian Kumlien
@ 2017-04-22  1:07           ` Saeed Mahameed
  2017-04-22  9:28             ` Ian Kumlien
  0 siblings, 1 reply; 9+ messages in thread
From: Saeed Mahameed @ 2017-04-22  1:07 UTC (permalink / raw)
  To: Ian Kumlien, Leon Romanovsky, Matan Barak
  Cc: David Miller, Saeed Mahameed, Linux Kernel Network Developers

On Sat, Apr 22, 2017 at 3:47 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
> On Sat, Apr 22, 2017 at 2:34 AM, Saeed Mahameed
> <saeedm@dev.mellanox.co.il> wrote:
>> On Sat, Apr 22, 2017 at 2:10 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>>> Sorry,
>>>
>>> Back again, fighting cold, hot whiskey has been consumed...
>>>
>>> Something like this would perhaps be a better solution:
>>>
>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> index 60154a175bd3..fe192e247601 100644
>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>> @@ -1139,6 +1139,10 @@ static int mlx5_load_one(struct mlx5_core_dev
>>> *dev, struct mlx5_priv *priv,
>>>
>>>  #ifdef CONFIG_MLX5_CORE_EN
>>>         mlx5_eswitch_attach(dev->priv.eswitch);
>>> +#else
>>> +       if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
>>> +               dev_info(&pdev->dev, "Ethernet device discovered but
>>> support not enabled in kernel.");
>>> +       }
>>>  #endif
>>>
>>
>> Currently both MLX5_CORE=n and MLX5_CORE_EN=n as a default, the issue
>> you are seeing can occur only if you explicitly  set MLX5_CORE=y and
>> MLX5_CORE_EN=n, Why would someone do this if he knows he wants Ethernet
>> support as well ? IMHO this print is redundant .
>
> Well, I'm running a prebuilt kernel - which was configured this way,
> and since there
> is no mlx5_en module and it does state that the link is "Ethernet", it
> just looks like the
> driver is broken or in some kind of really weird state.
>
>> Anyway, Are you looking for RDMA support over ethernet (RoCE) ? and
>> you are not interested to have ethernet netdev support ?
>
> ? RDMA is something we'll look at in the future, right now, having the
> nics actually
> work as nics is a priority ;)
>

I see, I just wanted to understand your situation :)

>> if yes, I think this is something that can be achieved, but the
>> question is do we really need this ?
>
> It's really weird to see the driver load, to see everything register
> and have no feedback.
>

So, in your case you have mlx5 core support without MLX5_CORE_EN, which
provides the eswitch and netdev functionality in ethernet.

But you will still have mlx5_ib register an RDMA interface and
theoretically it should work; the only thing you won't see is a
netdevice.

The weird thing is that you don't see a link up on the RDMA interface.
Leon/Matan, can you please look into this? Do we really need a netdev
to have a functioning RDMA logical link in ethernet?

> Including no network devices, but if you run the Infiniband commands,
> they tell you that
> you are connected to Ethernet but that the device is down and disabled.
>
> To me, down and disabled is not the same as in "Ethernet support is
> not included" =)
>
> Basically, i would hate for someone else to end up in the same
> situation since you only
> get guides on how to enable infiniband/RDMA but what you really want
> to do at that point
> is to disable it and see if that gives you your network devices back =)
>

Yes, this is misleading. Maybe your kernel log warning is not so bad
after all, but let me dig more into this.
I will get back to you next week.

> I have had similar issues with some connectx3 devices while playing at
> home but i suspect
> that it's just a limitation of OFED packages available for the dist I'm running.


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-22  1:07           ` Saeed Mahameed
@ 2017-04-22  9:28             ` Ian Kumlien
  2017-04-23  7:54               ` Matan Barak
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Kumlien @ 2017-04-22  9:28 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Leon Romanovsky, Matan Barak, David Miller, Saeed Mahameed,
	Linux Kernel Network Developers

On Sat, Apr 22, 2017 at 3:07 AM, Saeed Mahameed
<saeedm@dev.mellanox.co.il> wrote:
> On Sat, Apr 22, 2017 at 3:47 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>> On Sat, Apr 22, 2017 at 2:34 AM, Saeed Mahameed
>> <saeedm@dev.mellanox.co.il> wrote:
>>> On Sat, Apr 22, 2017 at 2:10 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>>>> Sorry,
>>>>
>>>> Back again, fighting cold, hot whiskey has been consumed...
>>>>
>>>> Something like this would perhaps be a better solution:
>>>>
>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>> b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>> index 60154a175bd3..fe192e247601 100644
>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>> @@ -1139,6 +1139,10 @@ static int mlx5_load_one(struct mlx5_core_dev
>>>> *dev, struct mlx5_priv *priv,
>>>>
>>>>  #ifdef CONFIG_MLX5_CORE_EN
>>>>         mlx5_eswitch_attach(dev->priv.eswitch);
>>>> +#else
>>>> +       if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
>>>> +               dev_info(&pdev->dev, "Ethernet device discovered but
>>>> support not enabled in kernel.");
>>>> +       }
>>>>  #endif
>>>>
>>>
>>> Currently both MLX5_CORE=n and MLX5_CORE_EN=n as a default, the issue
>>> you are seeing can occur only if you explicitly  set MLX5_CORE=y and
>>> MLX5_CORE_EN=n, Why would someone do this if he knows he wants Ethernet
>>> support as well ? IMHO this print is redundant .
>>
>> Well, I'm running a prebuilt kernel - which was configured this way,
>> and since there
>> is no mlx5_en module and it does state that the link is "Ethernet", it
>> just looks like the
>> driver is broken or in some kind of really weird state.
>>
>>> Anyway, Are you looking for RDMA support over ethernet (RoCE) ? and
>>> you are not interested to have ethernet netdev support ?
>>
>> ? RDMA is something we'll look at in the future, right now, having the
>> nics actually
>> work as nics is a priority ;)
>>
>
> I see, i just wanted to understand your situation :)
>
>>> if yes, I think this is something that can be achieved, but the
>>> question is do we really need this ?
>>
>> It's really weird to see the driver load, to see everything register
>> and have no feedback.
>>
>
> So, in your case you have mlx5 core support without MLX5_CORE_EN which
> provides the eswitch and netdev functionality in ethernet.

Yes

> But you will still have mlx5_ib register an RDMA interface and
> theoretically it should work, the only thing you won't see is a
> netdevice.
>
> The weird thing is that you don't see a link up on the RDMA interface,
> Leon/Matan can you please look into this ? do we really need a netdev
> to have a functioning RDMA logical link in ethernet ?

The switch we have does support RDMA, but the manual is sparse (as in
nothing really there) wrt enabling/configuring the RDMA bit, so something
might be missing.

I'll try to remember to do the same test when we set up the mellanox switches =)

>> Including no network devices, but if you run the Infiniband commands,
>> they tell you that
>> you are connected to Ethernet but that the device is down and disabled.
>>
>> To me, down and disabled is not the same as in "Ethernet support is
>> not included" =)
>>
>> Basically, i would hate for someone else to end up in the same
>> situation since you only
>> get guides on how to enable infiniband/RDMA but what you really want
>> to do at that point
>> is to disable it and see if that gives you your network devices back =)
>>
>
> Yes this is misleading, Maybe your kernel log warning is not so bad
> after all, but let me dig more into this.
> I will get back to you next week.

Thanks, I bet that there are better ways to do it, this one was just
one of the first ones I found =)

>> I have had similar issues with some connectx3 devices while playing at
>> home but i suspect
>> that it's just a limitation of OFED packages available for the dist I'm running.


* Re: [RFC] change the default Kconfig value of mlx5_en
  2017-04-22  9:28             ` Ian Kumlien
@ 2017-04-23  7:54               ` Matan Barak
  0 siblings, 0 replies; 9+ messages in thread
From: Matan Barak @ 2017-04-23  7:54 UTC (permalink / raw)
  To: Ian Kumlien, Saeed Mahameed
  Cc: Leon Romanovsky, David Miller, Saeed Mahameed,
	Linux Kernel Network Developers

On 22/04/2017 12:28, Ian Kumlien wrote:
> On Sat, Apr 22, 2017 at 3:07 AM, Saeed Mahameed
> <saeedm@dev.mellanox.co.il> wrote:
>> On Sat, Apr 22, 2017 at 3:47 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>>> On Sat, Apr 22, 2017 at 2:34 AM, Saeed Mahameed
>>> <saeedm@dev.mellanox.co.il> wrote:
>>>> On Sat, Apr 22, 2017 at 2:10 AM, Ian Kumlien <ian.kumlien@gmail.com> wrote:
>>>>> Sorry,
>>>>>
>>>>> Back again, fighting cold, hot whiskey has been consumed...
>>>>>
>>>>> Something like this would perhaps be a better solution:
>>>>>
>>>>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>>> index 60154a175bd3..fe192e247601 100644
>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
>>>>> @@ -1139,6 +1139,10 @@ static int mlx5_load_one(struct mlx5_core_dev
>>>>> *dev, struct mlx5_priv *priv,
>>>>>
>>>>>  #ifdef CONFIG_MLX5_CORE_EN
>>>>>         mlx5_eswitch_attach(dev->priv.eswitch);
>>>>> +#else
>>>>> +       if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) {
>>>>> +               dev_info(&pdev->dev, "Ethernet device discovered but
>>>>> support not enabled in kernel.");
>>>>> +       }
>>>>>  #endif
>>>>>
>>>>
>>>> Currently both MLX5_CORE=n and MLX5_CORE_EN=n as a default, the issue
>>>> you are seeing can occur only if you explicitly  set MLX5_CORE=y and
>>>> MLX5_CORE_EN=n, Why would someone do this if he knows he wants Ethernet
>>>> support as well ? IMHO this print is redundant .
>>>
>>> Well, I'm running a prebuilt kernel - which was configured this way,
>>> and since there
>>> is no mlx5_en module and it does state that the link is "Ethernet", it
>>> just looks like the
>>> driver is broken or in some kind of really weird state.
>>>
>>>> Anyway, Are you looking for RDMA support over ethernet (RoCE) ? and
>>>> you are not interested to have ethernet netdev support ?
>>>
>>> ? RDMA is something we'll look at in the future, right now, having the
>>> nics actually
>>> work as nics is a priority ;)
>>>
>>
>> I see, i just wanted to understand your situation :)
>>
>>>> if yes, I think this is something that can be achieved, but the
>>>> question is do we really need this ?
>>>
>>> It's really weird to see the driver load, to see everything register
>>> and have no feedback.
>>>
>>
>> So, in your case you have mlx5 core support without MLX5_CORE_EN which
>> provides the eswitch and netdev functionality in ethernet.
>
> Yes
>
>> But you will still have mlx5_ib register an RDMA interface and
>> theoretically it should work, the only thing you won't see is a
>> netdevice.
>>
>> The weird thing is that you don't see a link up on the RDMA interface,
>> Leon/Matan can you please look into this ? do we really need a netdev
>> to have a functioning RDMA logical link in ethernet ?
>

The RDMA core subsystem listens to netdev events and configures RoCE GIDs
accordingly. It currently relies on a RoCE dev having an associated
netdev, as even default GIDs (the equivalent to IPv6 link-local GIDs)
rely on a MAC address that comes from the netdev.
In IB NICs, the case is different. There's no associated netdev for
that, so compiling MLX5_CORE_EN isn't required.
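
To make the MAC dependency concrete, here is a rough userspace sketch of
how a link-local style default GID can be derived from a MAC address. It
assumes the usual IPv6 construction (fe80::/64 prefix plus a modified
EUI-64 interface ID); the function name is hypothetical and the kernel's
own code may differ in detail.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build a link-local style default GID from a 48-bit MAC address:
 * fe80::/64 prefix followed by a modified EUI-64 interface ID.
 * Illustrative sketch only, not the kernel implementation.
 */
static void mac_to_default_gid(const uint8_t mac[6], uint8_t gid[16])
{
	memset(gid, 0, 16);
	gid[0] = 0xfe;			/* fe80::/64 link-local prefix */
	gid[1] = 0x80;
	gid[8] = mac[0] ^ 0x02;		/* flip the universal/local bit */
	gid[9] = mac[1];
	gid[10] = mac[2];
	gid[11] = 0xff;			/* ff:fe inserted in the middle */
	gid[12] = 0xfe;
	gid[13] = mac[3];
	gid[14] = mac[4];
	gid[15] = mac[5];
}
```

Without a netdev there is no MAC address to feed into a derivation like
this, which fits the port staying down when MLX5_CORE_EN is not built.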

> The switch we have does support RDMA but the manual is sparse (as in
> nothing really there) wrt enabling/configuring the RDMA bit so something
> might be missing.
>
> I'll try to remember to do the same test when we setup the mellanox switches =)
>
>>> Including no network devices, but if you run the Infiniband commands,
>>> they tell you that
>>> you are connected to Ethernet but that the device is down and disabled.
>>>
>>> To me, down and disabled is not the same as in "Ethernet support is
>>> not included" =)
>>>
>>> Basically, i would hate for someone else to end up in the same
>>> situation since you only
>>> get guides on how to enable infiniband/RDMA but what you really want
>>> to do at that point
>>> is to disable it and see if that gives you your network devices back =)
>>>
>>
>> Yes this is misleading, Maybe your kernel log warning is not so bad
>> after all, but let me dig more into this.
>> I will get back to you next week.
>
> Thanks, I bet that there is better ways to do it, this one was just
> one of the first ones i found =)
>
>>> I have had similar issues with some connectx3 devices while playing at
>>> home but i suspect
>>> that it's just a limitation of OFED packages available for the dist I'm running.

