* mlx4: kernel 3.4-rc1 breaks libumad
@ 2012-04-02  7:42 Bart Van Assche
       [not found] ` <4F795880.4070306-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-04-02  7:42 UTC (permalink / raw)
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA

Hi,

Apparently applications based on libumad can find local ports with
kernel 3.2.x but not with kernel 3.4-rc1.

# uname -r
3.4.0-rc1
# ls /sys/class/infiniband/mlx4_0/ports/1/rate
/sys/class/infiniband/mlx4_0/ports/1/rate
# cat /sys/class/infiniband/mlx4_0/ports/1/rate
cat: /sys/class/infiniband/mlx4_0/ports/1/rate: Invalid argument

This breaks libumad because this makes sys_read_string() fail and hence
also get_port().
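
The failure mode can be illustrated with a minimal userspace sketch. It assumes only that the read() on the sysfs attribute fails with EINVAL, as the cat output above shows. The helper name read_attr is hypothetical and this is not the actual sys_read_string() code from the library.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the library's sys_read_string): read one sysfs
 * attribute into buf. Returns 0 on success or a negative errno value, so
 * a failing kernel show() callback surfaces here as e.g. -EINVAL.
 */
static int read_attr(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd, err;

	if (len == 0)
		return -EINVAL;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, buf, len - 1);
	err = errno;	/* save before close() can overwrite it */
	close(fd);
	if (n < 0)
		return -err;

	buf[n] = '\0';
	/* sysfs values normally end in a newline; strip it */
	if (n > 0 && buf[n - 1] == '\n')
		buf[n - 1] = '\0';
	return 0;
}
```

Called on /sys/class/infiniband/mlx4_0/ports/1/rate on the affected kernel, a helper like this would be expected to return -EINVAL, and any caller that treats a failed read as fatal would then drop the port.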

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found] ` <4F795880.4070306-HInyCGIudOg@public.gmane.org>
@ 2012-04-02 10:33   ` Or Gerlitz
       [not found]     ` <4F798069.4030305-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2012-04-02 10:33 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/2/2012 10:42 AM, Bart Van Assche wrote:
> # uname -r
> 3.4.0-rc1
> # ls /sys/class/infiniband/mlx4_0/ports/1/rate
> /sys/class/infiniband/mlx4_0/ports/1/rate
> # cat /sys/class/infiniband/mlx4_0/ports/1/rate
> cat: /sys/class/infiniband/mlx4_0/ports/1/rate: Invalid argument
>

It seems that this happens when the link layer is "wrong", e.g. does
/sys/class/infiniband/mlx4_0/ports/1/link_layer show the actual link
type that port 1 of your HCA is connected through? Looking into this.

Or.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]     ` <4F798069.4030305-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-04-02 11:16       ` Bart Van Assche
       [not found]         ` <4F798A9B.7060805-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-04-02 11:16 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 04/02/12 10:33, Or Gerlitz wrote:

> On 4/2/2012 10:42 AM, Bart Van Assche wrote:
>> # uname -r
>> 3.4.0-rc1
>> # ls /sys/class/infiniband/mlx4_0/ports/1/rate
>> /sys/class/infiniband/mlx4_0/ports/1/rate
>> # cat /sys/class/infiniband/mlx4_0/ports/1/rate
>> cat: /sys/class/infiniband/mlx4_0/ports/1/rate: Invalid argument
> 
> seems that this happens when the link layer is "wrong" e.g does
> /sys/class/infiniband/mlx4_0/ports/1/link_layer shows the actual link
> which is your hca port 1 is connected through? looking on this.


As far as I can see the link layer value is fine:
$ cat /sys/class/infiniband/mlx4_0/ports/1/link_layer
InfiniBand
$ cat /sys/class/infiniband/mlx4_0/ports/2/link_layer
InfiniBand

Bart.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]         ` <4F798A9B.7060805-HInyCGIudOg@public.gmane.org>
@ 2012-04-02 11:20           ` Or Gerlitz
       [not found]             ` <4F798B9A.6090309-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2012-04-02 11:20 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/2/2012 2:16 PM, Bart Van Assche wrote:
> On 04/02/12 10:33, Or Gerlitz wrote:
>
>
> As far as I can see the link layer value is fine:
> $ cat /sys/class/infiniband/mlx4_0/ports/1/link_layer
> InfiniBand
> $ cat /sys/class/infiniband/mlx4_0/ports/2/link_layer
> InfiniBand
>

So are the two ports actually connected to an IB switch? Please send
the ibv_devinfo output (HCA, firmware, etc.).

Or.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]             ` <4F798B9A.6090309-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-04-02 11:48               ` Bart Van Assche
       [not found]                 ` <4F799222.3050306-HInyCGIudOg@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-04-02 11:48 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 04/02/12 11:20, Or Gerlitz wrote:

> On 4/2/2012 2:16 PM, Bart Van Assche wrote:
>> On 04/02/12 10:33, Or Gerlitz wrote:
>>
>> As far as I can see the link layer value is fine:
>> $ cat /sys/class/infiniband/mlx4_0/ports/1/link_layer
>> InfiniBand
>> $ cat /sys/class/infiniband/mlx4_0/ports/2/link_layer
>> InfiniBand
> 
> So the two ports are actually connected to IB switch? please send
> ibv_devinfo output (HCA, firmware, etc).


The two ports are connected back-to-back to another mlx4 HCA. I noticed this behavior change because opensm stopped working after rebooting into 3.4-rc1. The ibv_devinfo output is as follows:

# ibv_devinfo -v
hca_id: mlx4_0
        transport:                      InfiniBand (0)
        fw_ver:                         2.9.1000
        node_guid:                      0002:c903:0005:f34e
        sys_image_guid:                 0002:c903:0005:f351
        vendor_id:                      0x02c9
        vendor_part_id:                 26428
        hw_ver:                         0xA0
        board_id:                       MT_0BB0120003
        phys_port_cnt:                  2
        max_mr_size:                    0xffffffffffffffff
        page_size_cap:                  0xfffffe00
        max_qp:                         131008
        max_qp_wr:                      16384
        device_cap_flags:               0x007c9c76
        max_sge:                        32
        max_sge_rd:                     0
        max_cq:                         65408
        max_cqe:                        4194303
        max_mr:                         524272
        max_pd:                         32764
        max_qp_rd_atom:                 16
        max_ee_rd_atom:                 0
        max_res_rd_atom:                2096128
        max_qp_init_rd_atom:            128
        max_ee_init_rd_atom:            0
        atomic_cap:                     ATOMIC_HCA (1)
        max_ee:                         0
        max_rdd:                        0
        max_mw:                         0
        max_raw_ipv6_qp:                0
        max_raw_ethy_qp:                0
        max_mcast_grp:                  8192
        max_mcast_qp_attach:            248
        max_total_mcast_qp_attach:      2031616
        max_ah:                         0
        max_fmr:                        0
        max_srq:                        65472
        max_srq_wr:                     16383
        max_srq_sge:                    31
        max_pkeys:                      128
        local_ca_ack_delay:             15
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 1
                        port_lid:               1
                        port_lmc:               0x00
                        link_layer:             InfiniBand
                        max_msg_sz:             0x40000000
                        port_cap_flags:         0x0251086a
                        max_vl_num:             4 (3)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           128
                        gid_tbl_len:            128
                        subnet_timeout:         18
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           10.0 Gbps (4)
                        phys_state:             LINK_UP (5)
                        GID[  0]:               fe80:0000:0000:0000:0002:c903:0005:f34f

                port:   2
                        state:                  PORT_INIT (2)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00
                        link_layer:             InfiniBand
                        max_msg_sz:             0x40000000
                        port_cap_flags:         0x02510868
                        max_vl_num:             4 (3)
                        bad_pkey_cntr:          0x0
                        qkey_viol_cntr:         0x0
                        sm_sl:                  0
                        pkey_tbl_len:           128
                        gid_tbl_len:            128
                        subnet_timeout:         0
                        init_type_reply:        0
                        active_width:           4X (2)
                        active_speed:           10.0 Gbps (4)
                        phys_state:             LINK_UP (5)
                        GID[  0]:               fe80:0000:0000:0000:0002:c903:0005:f350

Bart.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                 ` <4F799222.3050306-HInyCGIudOg@public.gmane.org>
@ 2012-04-02 12:51                   ` Or Gerlitz
       [not found]                     ` <4F79A0C5.2030805-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2012-04-02 12:51 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/2/2012 2:48 PM, Bart Van Assche wrote:
> The two ports are connected back-to-back to another mlx4 HCA. I 
> noticed this behavior change since opensm stopped working after 
> rebooting into 3.4-rc1.

Can you add these prints and send me the output after attempting to cat
the rate file?

Or.

> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> index 83b720e..d20e4a4 100644
> --- a/drivers/infiniband/core/sysfs.c
> +++ b/drivers/infiniband/core/sysfs.c
> @@ -181,8 +181,13 @@ static ssize_t rate_show(struct ib_port *p, struct port_attribute *unused,
>         char *speed = "";
>         int rate = -1;          /* in deci-Gb/sec */
>         ssize_t ret;
> +       enum rdma_link_layer ll;
>
>         ret = ib_query_port(p->ibdev, p->port_num, &attr);
> +
> +       ll = rdma_port_get_link_layer(p->ibdev, p->port_num);
> +       printk(KERN_ERR "%s ret %d for ib_query_port dev %s port %d link %d\n",
> +               __func__, ret, p->ibdev->name, p->port_num, ll);
>         if (ret)
>                 return ret;
>
> diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
> index 75d3056..26b67c6 100644
> --- a/drivers/infiniband/hw/mlx4/main.c
> +++ b/drivers/infiniband/hw/mlx4/main.c
> @@ -256,6 +256,7 @@ static int ib_link_query_port(struct ib_device *ibdev, u8 port,
>  out:
>         kfree(in_mad);
>         kfree(out_mad);
> +       printk(KERN_ERR "%s active_speed %d\n", __func__, props->active_speed);
>         return err;
>  }
>
> @@ -312,6 +313,7 @@ out_unlock:
>         spin_unlock(&iboe->lock);
>  out:
>         mlx4_free_cmd_mailbox(mdev->dev, mailbox);
> +       printk(KERN_ERR "%s active_speed %d\n", __func__, props->active_speed);
>         return err;
>  }


* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                     ` <4F79A0C5.2030805-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-04-02 13:02                       ` Or Gerlitz
       [not found]                         ` <4F79A359.2020204-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2012-04-02 13:35                       ` Bart Van Assche
  1 sibling, 1 reply; 13+ messages in thread
From: Or Gerlitz @ 2012-04-02 13:02 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/2/2012 3:51 PM, Or Gerlitz wrote:
> can you add these prints and send me the output after attempting to 
> cat the rate file?

Okay, on a system which has IB on port 1 and Ethernet on port 2, using
this patch I get these prints:
> ib_link_query_port active_speed 4
> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
> eth_link_query_port active_speed 4
> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 2

But when I force the port 2 link layer to be IB as well, which means we
land in ib_link_query_port for an Ethernet port, I get the following:

> echo ib >  /sys/bus/pci/devices/0000:07:00.0/mlx4_port2
> ib_link_query_port active_speed 4
> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
> ib_link_query_port active_speed 7
> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 1

So when doing the MAD_IFC port info query command on an Ethernet port,
the firmware returns the value seven, which isn't among the IB speeds,
and we are left with rate = -1 in rate_show of
drivers/infiniband/core/sysfs.c.
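
To make the mechanism concrete, here is a simplified userspace model of that computation. It is a sketch, not the kernel source: the function name port_rate and the enum are hypothetical, and the speed encoding (SDR = 1, DDR = 2, QDR = 4) is an assumption based on QDR showing up as 4 in the ibv_devinfo output earlier in this thread.

```c
#include <errno.h>

/* Assumed speed encoding: one value per speed, with QDR (10.0 Gbps)
 * reported as 4, matching the ibv_devinfo output in this thread. */
enum speed { SPEED_SDR = 1, SPEED_DDR = 2, SPEED_QDR = 4 };

/*
 * Return the port rate in deci-Gb/sec, or -EINVAL when active_speed is
 * not a known value. width is the lane count (e.g. 1 or 4).
 */
static int port_rate(int active_speed, int width)
{
	int rate = -1;

	switch (active_speed) {
	case SPEED_SDR: rate = 25; break;
	case SPEED_DDR: rate = 50; break;
	case SPEED_QDR: rate = 100; break;
	}
	/* unknown speed: rate is still -1, so the product stays negative */
	rate *= width;
	if (rate < 0)
		return -EINVAL;
	return rate;
}
```

With these assumptions, port_rate(4, 4) gives 400 (40 Gb/sec for a 4X QDR link), while port_rate(7, 4) gives -EINVAL, which is what userspace sees as "Invalid argument".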

It should be pretty simple to come up with a patch for that situation,
but I want to better understand what happens on your system. Waiting
for the output...

Or.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                         ` <4F79A359.2020204-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
@ 2012-04-02 13:25                           ` Hal Rosenstock
       [not found]                             ` <4F79A8C4.5000604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 13+ messages in thread
From: Hal Rosenstock @ 2012-04-02 13:25 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Alex Netes,
	Ira Weiny

On 4/2/2012 9:02 AM, Or Gerlitz wrote:
> On 4/2/2012 3:51 PM, Or Gerlitz wrote:
>> can you add these prints and send me the output after attempting to
>> cat the rate file?
> 
> okay, on a system which has IB on port 1 and Ethernet on port 2, using
> this patch
> I get these prints:
>> ib_link_query_port active_speed 4
>> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
>> eth_link_query_port active_speed 4
>> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 2
> 
> but if forcing port 2 link layer to be IB as well, which means we will
> land in ib_link_query_port for an Ethernet port, I get the below
> 
>> echo ib >  /sys/bus/pci/devices/0000:07:00.0/mlx4_port2
>> ib_link_query_port active_speed 4
>> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
>> ib_link_query_port active_speed 7
>> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 1
> 
> So when doing the MAD_IFC port info query command on Ethernet port, the
> firmware returns the
> value of seven which isn't among the IB speeds and we are remained with
> rate=-1 in rate_show
> of drivers/infiniband/core/sysfs.c

libibumad (and infiniband-diags) are not yet RoCE ready AFAIK. Fixing
that at least for libibumad is minor. Ira can comment on infiniband-diags.

> It should be pretty simple to come with patch to that situation, but I
> want to better understand
> what happens on your system, waiting for the output...

I think there are 3 main issues here:
1. EINVAL can be returned from rate_show and hence "Invalid argument"
rate string should be handled in libibumad. I think this was Bart's
original point.
2. Why is rate_show returning EINVAL ? I think that's what you're trying
to isolate with the additional printks you sent Bart for sysfs.c.
3. link_layer ethernet should also be handled which is the issue you raised.

-- Hal

> Or.


* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                     ` <4F79A0C5.2030805-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
  2012-04-02 13:02                       ` Or Gerlitz
@ 2012-04-02 13:35                       ` Bart Van Assche
       [not found]                         ` <4F79AB24.2090200-HInyCGIudOg@public.gmane.org>
  1 sibling, 1 reply; 13+ messages in thread
From: Bart Van Assche @ 2012-04-02 13:35 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 04/02/12 12:51, Or Gerlitz wrote:

> On 4/2/2012 2:48 PM, Bart Van Assche wrote:
>> The two ports are connected back-to-back to another mlx4 HCA. I
>> noticed this behavior change since opensm stopped working after
>> rebooting into 3.4-rc1.
> 
> can you add these prints and send me the output after attempting to cat
> the rate file?


Some additional info:
- This issue only occurs if the back-to-back connected system is down,
  not if it is running.
- The output I get with the other system down is:

# cat /sys/class/infiniband/mlx4_0/ports/1/link_layer
InfiniBand
# dmesg -c >/dev/null
# cat /sys/class/infiniband/mlx4_0/ports/1/rate
cat: /sys/class/infiniband/mlx4_0/ports/1/rate: Invalid argument
# dmesg -c
ib_link_query_port active_speed 7
rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1

Bart.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                         ` <4F79AB24.2090200-HInyCGIudOg@public.gmane.org>
@ 2012-04-02 14:06                           ` Or Gerlitz
  0 siblings, 0 replies; 13+ messages in thread
From: Or Gerlitz @ 2012-04-02 14:06 UTC (permalink / raw)
  To: Bart Van Assche, Roland Dreier; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA

On 4/2/2012 4:35 PM, Bart Van Assche wrote:
>
> Some additional info:
> - This issue only occurs if the back-to-back connected system is down,
>    not if it is running.
> - The output I get with the other system down is:
>
> # cat /sys/class/infiniband/mlx4_0/ports/1/link_layer
> InfiniBand
> # dmesg -c>/dev/null
> # cat /sys/class/infiniband/mlx4_0/ports/1/rate
> cat: /sys/class/infiniband/mlx4_0/ports/1/rate: Invalid argument
> # dmesg -c
> ib_link_query_port active_speed 7
> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
>

So you're getting the same wrong value of seven that I get on my
systems. I think the way to go here would be to assume some fixed speed
(SDR?) when the link is down: whether the link is really down, or
appears down because the "wrong" link layer is assumed, the firmware
command returns that value of seven, which isn't in the speed enum.
Roland?
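
One possible shape of this proposal, as a hedged sketch only: the function effective_speed and the enum below are hypothetical, the speed encoding (SDR = 1, DDR = 2, QDR = 4) is an assumption, and this is not the patch that was eventually sent.

```c
/* Assumed speed encoding; only the three speeds discussed here. */
enum speed { SPEED_SDR = 1, SPEED_DDR = 2, SPEED_QDR = 4 };

/*
 * Hypothetical fallback: report SDR when the link is down or when the
 * firmware returns a value that is not a known speed (e.g. 7).
 */
static int effective_speed(int link_up, int reported)
{
	if (!link_up)
		return SPEED_SDR;

	switch (reported) {
	case SPEED_SDR:
	case SPEED_DDR:
	case SPEED_QDR:
		return reported;
	default:
		return SPEED_SDR;
	}
}
```

The design choice in this sketch is to never hand an out-of-range value to the rate calculation, so reading the rate attribute cannot fail just because the peer is down.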

Or.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                             ` <4F79A8C4.5000604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2012-04-02 14:10                               ` Hal Rosenstock
  2012-04-02 14:51                               ` Or Gerlitz
  2012-04-03 21:37                               ` Ira Weiny
  2 siblings, 0 replies; 13+ messages in thread
From: Hal Rosenstock @ 2012-04-02 14:10 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Or Gerlitz, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Alex Netes, Ira Weiny

Bart,

On 4/2/2012 9:25 AM, Hal Rosenstock wrote:
> On 4/2/2012 9:02 AM, Or Gerlitz wrote:
>> On 4/2/2012 3:51 PM, Or Gerlitz wrote:
>>> can you add these prints and send me the output after attempting to
>>> cat the rate file?
>>
>> okay, on a system which has IB on port 1 and Ethernet on port 2, using
>> this patch
>> I get these prints:
>>> ib_link_query_port active_speed 4
>>> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
>>> eth_link_query_port active_speed 4
>>> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 2
>>
>> but if forcing port 2 link layer to be IB as well, which means we will
>> land in ib_link_query_port for an Ethernet port, I get the below
>>
>>> echo ib >  /sys/bus/pci/devices/0000:07:00.0/mlx4_port2
>>> ib_link_query_port active_speed 4
>>> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
>>> ib_link_query_port active_speed 7
>>> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 1
>>
>> So when doing the MAD_IFC port info query command on Ethernet port, the
>> firmware returns the
>> value of seven which isn't among the IB speeds and we are remained with
>> rate=-1 in rate_show
>> of drivers/infiniband/core/sysfs.c
> 
> libibumad (and infiniband-diags) are not yet RoCE ready AFAIK. Fixing
> that at least for libibumad is minor. Ira can comment on infiniband-diags.
> 
>> It should be pretty simple to come with patch to that situation, but I
>> want to better understand
>> what happens on your system, waiting for the output...
> 
> I think there are 3 main issues here:
> 1. EINVAL can be returned from rate_show and hence "Invalid argument"
> rate string should be handled in libibumad. I think this was Bart's
> original point.

Would you please try libibumad patch below ? Thanks.

-- Hal

> 2. Why is rate_show returning EINVAL ? I think that's what you're trying
> to isolate with the additional printks you sent Bart for sysfs.c.
> 3. link_layer ethernet should also be handled which is the issue you raised.
> 
> -- Hal
> 
>> Or.

libibumad/umad.c: In get_port, handle "invalid" rates

where sysfs rate file contains "Invalid argument"

Signed-off-by: Hal Rosenstock <hal-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
---
diff --git a/src/umad.c b/src/umad.c
index 45a9423..c638ebd 100644
--- a/src/umad.c
+++ b/src/umad.c
@@ -132,6 +132,7 @@ static int get_port(char *ca_name, char *dir, int portnum, umad_port_t * port)
 	uint8_t gid[16];
 	struct dirent **namelist = NULL;
 	int i, len, num_pkeys = 0;
+	char tmp[24];
 
 	strncpy(port->ca_name, ca_name, sizeof port->ca_name - 1);
 	port->portnum = portnum;
@@ -153,8 +154,13 @@ static int get_port(char *ca_name, char *dir, int portnum, umad_port_t * port)
 		goto clean;
 	if (sys_read_uint(port_dir, SYS_PORT_PHY_STATE, &port->phys_state) < 0)
 		goto clean;
-	if (sys_read_uint(port_dir, SYS_PORT_RATE, &port->rate) < 0)
-		goto clean;
+	if (sys_read_uint(port_dir, SYS_PORT_RATE, &port->rate) < 0) {
+		if (sys_read_string(port_dir, SYS_PORT_RATE, tmp,
+				    sizeof(tmp)) < 0)
+			goto clean;
+		if (strcmp(tmp, strerror(EINVAL)))
+			goto clean;
+	}
 	if (sys_read_uint(port_dir, SYS_PORT_CAPMASK, &port->capmask) < 0)
 		goto clean;
 


* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                             ` <4F79A8C4.5000604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2012-04-02 14:10                               ` Hal Rosenstock
@ 2012-04-02 14:51                               ` Or Gerlitz
  2012-04-03 21:37                               ` Ira Weiny
  2 siblings, 0 replies; 13+ messages in thread
From: Or Gerlitz @ 2012-04-02 14:51 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA, Alex Netes,
	Ira Weiny

On 4/2/2012 4:25 PM, Hal Rosenstock wrote:
>
> I think there are 3 main issues here:
> 1. EINVAL can be returned from rate_show and hence "Invalid argument"
> rate string should be handled in libibumad. I think this was Bart's original point.
> 2. Why is rate_show returning EINVAL ? I think that's what you're trying
> to isolate with the additional printks you sent Bart for sysfs.c.
> 3. link_layer ethernet should also be handled which is the issue you raised.

Just to sync: I just sent kernel patches which should eliminate the case
where EINVAL is seen by rate_show after calling into the mlx4 driver.
This addresses point #2. As for #3, I ran a check and forced the
Ethernet link layer on an IB port; in this case the mlx4 driver issues
the Ethernet port query command and it returns a valid value for
active_speed (4), so I didn't patch that code as well. As for the
libibumad fixes, I'm not dealing with that...

Or.

* Re: mlx4: kernel 3.4-rc1 breaks libumad
       [not found]                             ` <4F79A8C4.5000604-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2012-04-02 14:10                               ` Hal Rosenstock
  2012-04-02 14:51                               ` Or Gerlitz
@ 2012-04-03 21:37                               ` Ira Weiny
  2 siblings, 0 replies; 13+ messages in thread
From: Ira Weiny @ 2012-04-03 21:37 UTC (permalink / raw)
  To: Hal Rosenstock
  Cc: Or Gerlitz, Bart Van Assche, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	Alex Netes

On Mon, 02 Apr 2012 09:25:24 -0400
Hal Rosenstock <hal-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:

> On 4/2/2012 9:02 AM, Or Gerlitz wrote:
> > On 4/2/2012 3:51 PM, Or Gerlitz wrote:
> >> can you add these prints and send me the output after attempting to
> >> cat the rate file?
> > 
> > okay, on a system which has IB on port 1 and Ethernet on port 2, using
> > this patch
> > I get these prints:
> >> ib_link_query_port active_speed 4
> >> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
> >> eth_link_query_port active_speed 4
> >> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 2
> > 
> > but if forcing port 2 link layer to be IB as well, which means we will
> > land in ib_link_query_port for an Ethernet port, I get the below
> > 
> >> echo ib >  /sys/bus/pci/devices/0000:07:00.0/mlx4_port2
> >> ib_link_query_port active_speed 4
> >> rate_show ret 0 for ib_query_port dev mlx4_0 port 1 link 1
> >> ib_link_query_port active_speed 7
> >> rate_show ret 0 for ib_query_port dev mlx4_0 port 2 link 1
> > 
> > So when doing the MAD_IFC port info query command on Ethernet port, the
> > firmware returns the
> > value of seven which isn't among the IB speeds and we are remained with
> > rate=-1 in rate_show
> > of drivers/infiniband/core/sysfs.c
> 
> libibumad (and infiniband-diags) are not yet RoCE ready AFAIK. Fixing
> that at least for libibumad is minor. Ira can comment on infiniband-diags.

I agree they are not "RoCE ready".  But the main reason is I am unclear what "RoCE ready" means.  My first thought is that "InfiniBand" Diags should not function on an Ethernet link.  However, we seem to be merging much of the functionality and it does not seem to hurt in most cases.

If some of the diags do retain functionality on an Ethernet link then perhaps some name changes are in order in addition to testing.  For example "ibstat" should probably be "rdmastat" or something.  (This change was made to the perftest package a long time ago.)

I guess my question to the hardware vendors is:

What MADs, __if__ any, do you see Ethernet supporting in the future?  Do you see MADs being used in some OpenFlow spec to be able to program switches?  What about Performance Management?

I don't want to get all draconian and remove these devices, since having more information (i.e. from ibstat) is good.  But other than that tool, what else should the diags support?

Ira

> 
> > It should be pretty simple to come with patch to that situation, but I
> > want to better understand
> > what happens on your system, waiting for the output...
> 
> I think there are 3 main issues here:
> 1. EINVAL can be returned from rate_show and hence "Invalid argument"
> rate string should be handled in libibumad. I think this was Bart's
> original point.
> 2. Why is rate_show returning EINVAL ? I think that's what you're trying
> to isolate with the additional printks you sent Bart for sysfs.c.
> 3. link_layer ethernet should also be handled which is the issue you raised.
> 
> -- Hal
> 
> > Or.
> 


-- 
Ira Weiny
Member of Technical Staff
Lawrence Livermore National Lab
925-423-8008
weiny2-i2BcT+NCU+M@public.gmane.org
