* [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
@ 2010-10-24  7:58 Eli Cohen
  2010-10-24 15:42 ` Roland Dreier
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-24  7:58 UTC (permalink / raw)
  To: Roland Dreier; +Cc: RDMA list

eth_link_query_port() returns error if a netdevice was not yet associated with
the IBoE port. This is not required since we already initialize the link as
down.  On the other hand, we need other information that the query provides.
Specifically, this can cause a failure to initialize an IBoE device after
commit 5eb620c8, which calls ib_query_port().
Fix this by always returning success.

Signed-off-by: Eli Cohen <eli-VPRAkNaXOzVS1MOuV/RT9w@public.gmane.org>
---
 drivers/infiniband/hw/mlx4/main.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 30cd111..bf3e20c 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -217,7 +217,6 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
 {
 	struct mlx4_ib_iboe *iboe = &to_mdev(ibdev)->iboe;
 	struct net_device *ndev;
-	int err = 0;
 	enum ib_mtu tmp;
 
 	props->active_width	= IB_WIDTH_4X;
@@ -237,10 +236,8 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
 	props->active_mtu	= IB_MTU_256;
 	spin_lock(&iboe->lock);
 	ndev = iboe->netdevs[port - 1];
-	if (!ndev) {
-		err = -ENOMEM;
+	if (!ndev)
 		goto out;
-	}
 
 	tmp = iboe_get_mtu(ndev->mtu);
 	props->active_mtu = tmp ? min(props->max_mtu, tmp) : IB_MTU_256;
@@ -251,7 +248,7 @@ static int eth_link_query_port(struct ib_device *ibdev, u8 port,
 
 out:
 	spin_unlock(&iboe->lock);
-	return err;
+	return 0;
 }
 
 static int mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
-- 
1.7.3.1
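
For readers following along, the idea of the patch can be sketched in
plain C outside the kernel: fill in "link down" defaults first, then
refine them only when a netdevice is bound to the port, and report
success either way. This is a minimal sketch, not the mlx4 code. Every
sketch_* name and SKETCH_* constant is hypothetical, and the MTU is
copied directly instead of going through iboe_get_mtu().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real mlx4 definitions. */
enum sketch_port_state { SKETCH_PORT_DOWN, SKETCH_PORT_ACTIVE };

struct sketch_netdev {
	int mtu;
	int carrier_ok;
};

struct sketch_port_attr {
	enum sketch_port_state state;
	int active_mtu;
};

#define SKETCH_NUM_PORTS   2
#define SKETCH_DEFAULT_MTU 256

/*
 * Fill in defaults first, then refine them only if a netdevice is bound.
 * A missing netdevice is not an error: the port is simply reported as down.
 */
static int sketch_eth_query_port(struct sketch_netdev *netdevs[SKETCH_NUM_PORTS],
				 int port, struct sketch_port_attr *props)
{
	struct sketch_netdev *ndev;

	assert(port >= 1 && port <= SKETCH_NUM_PORTS);

	props->state = SKETCH_PORT_DOWN;
	props->active_mtu = SKETCH_DEFAULT_MTU;

	ndev = netdevs[port - 1];
	if (!ndev)
		return 0;	/* defaults already describe a down link */

	props->active_mtu = ndev->mtu;
	if (ndev->carrier_ok)
		props->state = SKETCH_PORT_ACTIVE;
	return 0;
}
```

A caller that treats any non-zero return as fatal can then register the
device before the netdevice shows up and still see a sane, down port.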

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
  2010-10-24  7:58 [PATCH] mlx4: Fix unneeded return error in eth_link_query_port Eli Cohen
@ 2010-10-24 15:42 ` Roland Dreier
       [not found]   ` <adahbgbppgx.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Roland Dreier @ 2010-10-24 15:42 UTC (permalink / raw)
  To: Eli Cohen; +Cc: RDMA list

applied, thanks.

(I didn't introduce this bug in my monkeying with the patches, did I?)

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]   ` <adahbgbppgx.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-10-24 16:00     ` Eli Cohen
  2010-10-24 16:22       ` Roland Dreier
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-24 16:00 UTC (permalink / raw)
  To: Roland Dreier; +Cc: RDMA list

On Sun, Oct 24, 2010 at 08:42:06AM -0700, Roland Dreier wrote:
> applied, thanks.
> 
> (I didn't introduce this bug in my monkeying with the patches, did I?)

No you did not. It was there already but we never noticed before
Yossi's patch.

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
  2010-10-24 16:00     ` Eli Cohen
@ 2010-10-24 16:22       ` Roland Dreier
       [not found]         ` <adaocajo90s.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Roland Dreier @ 2010-10-24 16:22 UTC (permalink / raw)
  To: Eli Cohen; +Cc: RDMA list

 > No you did not. It was there already but we never noticed before
 > Yossi's patch.

But AFAICT Yossi's patch (5eb620c8) went into 2.6.22 about 2.5 years
ago... wasn't that already there way before the IBoE stuff started?

 - R.

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]         ` <adaocajo90s.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-10-24 18:05           ` Eli Cohen
       [not found]             ` <AANLkTimb++kFYFXCWajBGACpA1OpCXyyeyD-98Ed3uTu-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-24 18:05 UTC (permalink / raw)
  To: Roland Dreier; +Cc: RDMA list

On Sun, Oct 24, 2010 at 6:22 PM, Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> wrote:
>  > No you did not. It was there already but we never noticed before
>  > Yossi's patch.
>
> But AFAICT Yossi's patch (5eb620c8) went into 2.6.22 about 2.5 years
> ago... wasn't that already there way before the IBoE stuff started?
>

I see... I think the reason it started failing comes from this portion
of patch 8:

+       mlx4_ib_port_link_layer(ibdev, port) == IB_LINK_LAYER_INFINIBAND ?
+               ib_link_query_port(ibdev, port, props, out_mad) :
+               eth_link_query_port(ibdev, port, props, out_mad);

I discarded the value returned and you added it as it should be.
So the fix is valid anyway, but the changelog has to be changed,
probably to something like:

eth_link_query_port() returns error if a netdevice was not yet associated with
the IBoE port. This is not required since we already initialize the link as
down.  Returning an error here can cause mlx4_ib loading to fail.
Fix this by always returning success.
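
To make the difference concrete, here is a small user-space sketch of
the two variants discussed above: one that evaluates the conditional
expression and drops the result, and one that returns it. It is only an
illustration under stated assumptions. The sketch_* functions and enum
are hypothetical stubs, and the -ENOMEM path mimics the pre-fix
eth_link_query_port() behaviour when no netdevice is attached.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical link-layer tags and query stubs, for illustration only. */
enum sketch_link_layer { SKETCH_LL_INFINIBAND, SKETCH_LL_ETHERNET };

static int sketch_ib_query(void)
{
	return 0;
}

/* Mimics the pre-fix behaviour: fails while no netdevice is attached. */
static int sketch_eth_query(int have_netdev)
{
	return have_netdev ? 0 : -ENOMEM;
}

/* Variant 1: the conditional expression is evaluated and its value dropped. */
static int sketch_query_port_discard(enum sketch_link_layer ll, int have_netdev)
{
	int err = 0;

	assert(ll == SKETCH_LL_INFINIBAND || ll == SKETCH_LL_ETHERNET);
	(void)(ll == SKETCH_LL_INFINIBAND ? sketch_ib_query()
					  : sketch_eth_query(have_netdev));
	return err;
}

/* Variant 2: the value is propagated, so the latent error becomes visible. */
static int sketch_query_port_propagate(enum sketch_link_layer ll, int have_netdev)
{
	assert(ll == SKETCH_LL_INFINIBAND || ll == SKETCH_LL_ETHERNET);
	return ll == SKETCH_LL_INFINIBAND ? sketch_ib_query()
					  : sketch_eth_query(have_netdev);
}
```

With no netdevice attached, the first variant reports success and the
second reports -ENOMEM, which is why the old error return only started
to matter once the value was passed up.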

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]             ` <AANLkTimb++kFYFXCWajBGACpA1OpCXyyeyD-98Ed3uTu-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-10-25  4:12               ` Roland Dreier
       [not found]                 ` <adad3qync5m.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  2010-10-25  9:26               ` Or Gerlitz
  1 sibling, 1 reply; 29+ messages in thread
From: Roland Dreier @ 2010-10-25  4:12 UTC (permalink / raw)
  To: Eli Cohen; +Cc: RDMA list

OK... since my "fix" for the missing return of the error value ended up
breaking things, I just decided to fold this second fix into the
original patch.  The end result is the same code but I didn't think
keeping the buggy point in the middle of history helped anything.

 - R.

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                 ` <adad3qync5m.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-10-25  8:18                   ` Eli Cohen
  0 siblings, 0 replies; 29+ messages in thread
From: Eli Cohen @ 2010-10-25  8:18 UTC (permalink / raw)
  To: Roland Dreier; +Cc: RDMA list

On Sun, Oct 24, 2010 at 09:12:37PM -0700, Roland Dreier wrote:
> OK... since my "fix" for the missing return of the error value ended up
> breaking things, I just decided to fold this second fix into the
> original patch.  The end result is the same code but I didn't think
> keeping the buggy point in the middle of history helped anything.
> 
Sure, makes sense.

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]             ` <AANLkTimb++kFYFXCWajBGACpA1OpCXyyeyD-98Ed3uTu-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2010-10-25  4:12               ` Roland Dreier
@ 2010-10-25  9:26               ` Or Gerlitz
       [not found]                 ` <4CC54D4E.7050203-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  1 sibling, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25  9:26 UTC (permalink / raw)
  To: Eli Cohen, Roland Dreier; +Cc: RDMA list

Eli Cohen wrote:
> On Sun, Oct 24, 2010 at 6:22 PM, Roland Dreier <rdreier-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org> wrote:
>>  > No you did not. It was there already but we never noticed before Yossi's patch.
>> But AFAICT Yossi's patch (5eb620c8) went into 2.6.22 about 2.5 years
>> ago... wasn't that already there way before the IBoE stuff started?

> I see... I think the reason it started failing comes from this portion of patch 8:

I pulled/built/booted with the for-next branch of Roland's tree, and I can't get IB link for the node, 
I don't think this is my problem, since I'm on L2 IB and not Eth, but should this work with pre 2.7 
firmware?! if not, maybe patch the mlx4 driver to print some error,

> # ibv_devinfo
> hca_id: mlx4_0
>         transport:                      InfiniBand (0)
>         fw_ver:                         2.6.818
>         node_guid:                      0002:c903:0002:6be2
>         sys_image_guid:                 0002:c903:0002:6be5
>         vendor_id:                      0x02c9
>         vendor_part_id:                 26418
>         hw_ver:                         0xA0
>         board_id:                       MT_0A50110002
>         phys_port_cnt:                  2
>                 port:   1
>                         state:                  PORT_INIT (2)
>                         max_mtu:                2048 (4)
>                         active_mtu:             2048 (4)
>                         sm_lid:                 0
>                         port_lid:               0
>                         port_lmc:               0x00
> 
>                 port:   2
>                         state:                  PORT_INIT (2)
>                         max_mtu:                2048 (4)
>                         active_mtu:             2048 (4)
>                         sm_lid:                 0
>                         port_lid:               0
>                         port_lmc:               0x00

> # dmesg
> mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
> mlx4_core: Initializing 0000:0b:00.0
> mlx4_core 0000:0b:00.0: PCI INT A -> GSI 30 (level, low) -> IRQ 30
> mlx4_core 0000:0b:00.0: setting latency timer to 64
> mlx4_core 0000:0b:00.0: FW version 2.6.818 (cmd intf rev 3), max commands 16
> mlx4_core 0000:0b:00.0: Catastrophic error buffer at 0x1f020, size 0x10, BAR 0
> mlx4_core 0000:0b:00.0: FW size 385 KB
> mlx4_core 0000:0b:00.0: Clear int @ f0058, BAR 0
> mlx4_core 0000:0b:00.0: Mapped 26 chunks/6168 KB for FW.
> mlx4_core 0000:0b:00.0: BlueFlame available (reg size 512, regs/page 256)
> mlx4_core 0000:0b:00.0: Base MM extensions: flags 00000cc0, rsvd L_Key 00000500
> mlx4_core 0000:0b:00.0: Max ICM size 4294967296 MB
> mlx4_core 0000:0b:00.0: Max QPs: 16777216, reserved QPs: 64, entry size: 256
> mlx4_core 0000:0b:00.0: Max SRQs: 16777216, reserved SRQs: 64, entry size: 128
> mlx4_core 0000:0b:00.0: Max CQs: 16777216, reserved CQs: 128, entry size: 128
> mlx4_core 0000:0b:00.0: Max EQs: 512, reserved EQs: 4, entry size: 128
> mlx4_core 0000:0b:00.0: reserved MPTs: 16, reserved MTTs: 16
> mlx4_core 0000:0b:00.0: Max PDs: 8388608, reserved PDs: 4, reserved UARs: 1
> mlx4_core 0000:0b:00.0: Max QP/MCG: 8388608, reserved MGMs: 0
> mlx4_core 0000:0b:00.0: Max CQEs: 4194304, max WQEs: 16384, max SRQ WQEs: 16384
> mlx4_core 0000:0b:00.0: Local CA ACK delay: 15, max MTU: 4096, port width cap: 3
> mlx4_core 0000:0b:00.0: Max SQ desc size: 1008, max SQ S/G: 62
> mlx4_core 0000:0b:00.0: Max RQ desc size: 512, max RQ S/G: 32
> mlx4_core 0000:0b:00.0: Max GSO size: 131072
> mlx4_core 0000:0b:00.0: DEV_CAP flags:
> mlx4_core 0000:0b:00.0:     RC transport
> mlx4_core 0000:0b:00.0:     UC transport
> mlx4_core 0000:0b:00.0:     UD transport
> mlx4_core 0000:0b:00.0:     XRC transport
> mlx4_core 0000:0b:00.0:     FCoIB support
> mlx4_core 0000:0b:00.0:     SRQ support
> mlx4_core 0000:0b:00.0:     IPoIB checksum offload
> mlx4_core 0000:0b:00.0:     P_Key violation counter
> mlx4_core 0000:0b:00.0:     Q_Key violation counter
> mlx4_core 0000:0b:00.0:     Big LSO headers
> mlx4_core 0000:0b:00.0:     APM support
> mlx4_core 0000:0b:00.0:     Atomic ops support
> mlx4_core 0000:0b:00.0:     Address vector port checking support
> mlx4_core 0000:0b:00.0:     UD multicast support
> mlx4_core 0000:0b:00.0:     Router support
> mlx4_core 0000:0b:00.0:     IBoE support
> mlx4_core 0000:0b:00.0:   profile[ 0] (  CMPT): 2^26 entries @ 0x         0, size 0x 100000000
> mlx4_core 0000:0b:00.0:   profile[ 1] (RDMARC): 2^21 entries @ 0x 100000000, size 0x   4000000
> mlx4_core 0000:0b:00.0:   profile[ 2] (   MTT): 2^20 entries @ 0x 104000000, size 0x   4000000
> mlx4_core 0000:0b:00.0:   profile[ 3] (    QP): 2^17 entries @ 0x 108000000, size 0x   2000000
> mlx4_core 0000:0b:00.0:   profile[ 4] (  ALTC): 2^17 entries @ 0x 10a000000, size 0x    800000
> mlx4_core 0000:0b:00.0:   profile[ 5] (   SRQ): 2^16 entries @ 0x 10a800000, size 0x    800000
> mlx4_core 0000:0b:00.0:   profile[ 6] (    CQ): 2^16 entries @ 0x 10b000000, size 0x    800000
> mlx4_core 0000:0b:00.0:   profile[ 7] (  DMPT): 2^17 entries @ 0x 10b800000, size 0x    800000
> mlx4_core 0000:0b:00.0:   profile[ 8] (   MCG): 2^13 entries @ 0x 10c000000, size 0x    200000
> mlx4_core 0000:0b:00.0:   profile[ 9] (  AUXC): 2^17 entries @ 0x 10c200000, size 0x     20000
> mlx4_core 0000:0b:00.0:   profile[10] (    EQ): 2^04 entries @ 0x 10c220000, size 0x      1000
> mlx4_core 0000:0b:00.0: HCA context memory: reserving 4393092 KB
> mlx4_core 0000:0b:00.0: 4393092 KB of HCA context requires 8620 KB aux memory.
> mlx4_core 0000:0b:00.0: Mapped 37 chunks/8620 KB for ICM aux.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 0 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 40000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 80000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/4 KB at c0000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/4 KB at 10c220000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 104000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10b800000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 108000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/128 KB at 10c200000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10a000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 100000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10b000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10a800000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c000000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c040000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c080000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c0c0000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c100000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c140000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c180000 for ICM.
> mlx4_core 0000:0b:00.0: Mapped 1 chunks/256 KB at 10c1c0000 for ICM.
> mlx4_core 0000:0b:00.0: irq 82 for MSI/MSI-X
> mlx4_core 0000:0b:00.0: irq 83 for MSI/MSI-X
> mlx4_core 0000:0b:00.0: irq 84 for MSI/MSI-X
> mlx4_core 0000:0b:00.0: irq 85 for MSI/MSI-X
> mlx4_core 0000:0b:00.0: irq 86 for MSI/MSI-X
> mlx4_core 0000:0b:00.0: NOP command IRQ test passed
> mlx4_en: Mellanox ConnectX HCA Ethernet driver v1.4.1.1 (June 2009)
> mlx4_ib: Mellanox ConnectX InfiniBand driver v1.0 (April 4, 2008)
> ADDRCONF(NETDEV_UP): ib0: link is not ready
> ADDRCONF(NETDEV_UP): ib1: link is not ready

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                 ` <4CC54D4E.7050203-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-10-25  9:33                   ` Or Gerlitz
       [not found]                     ` <4CC54ED7.6030303-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  2010-10-25 10:47                   ` can't get IB link with the for-next branch / IBoE patches (was "mlx4: Fix unneeded return error...") Or Gerlitz
                                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25  9:33 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Roland Dreier, RDMA list

> I pulled/built/booted with the for-next branch of Roland's tree, and I can't get IB link for the node, 
> I don't think this is my problem, since I'm on L2 IB and not Eth, but should this work with pre 2.7 
> firmware?! if not, maybe patch the mlx4 driver to print some error,

Oh, I also see an error printed by some script provided by the RHEL5 libmlx4 package, 

> # modprobe mlx4_ib
> awk: /etc/ofed/setup-mlx4.awk:6: (FILENAME=/etc/ofed/mlx4.conf FNR=21) fatal: cannot open file `/sbin/setup-mlx4' for reading (No such file or directory)

> # rpm -qf /etc/ofed/setup-mlx4.awk
> libmlx4-1.0.1-5.el5


> # rpm -ql libmlx4-1.0.1-5.el5
> /etc/libibverbs.d/mlx4.driver
> /etc/modprobe.d/libmlx4.conf
> /etc/ofed/mlx4.conf
> /etc/ofed/setup-mlx4.awk
> /usr/lib64/libmlx4-rdmav2.so
> /usr/lib64/libmlx4.so
> /usr/share/doc/libmlx4-1.0.1
> /usr/share/doc/libmlx4-1.0.1/AUTHORS
> /usr/share/doc/libmlx4-1.0.1/COPYING
> /usr/share/doc/libmlx4-1.0.1/README
> /etc/libibverbs.d/mlx4.driver
> /etc/modprobe.d/libmlx4.conf
> /etc/ofed/mlx4.conf
> /etc/ofed/setup-mlx4.awk
> /usr/lib/libmlx4-rdmav2.so
> /usr/lib/libmlx4.so
> /usr/share/doc/libmlx4-1.0.1
> /usr/share/doc/libmlx4-1.0.1/AUTHORS
> /usr/share/doc/libmlx4-1.0.1/COPYING
> /usr/share/doc/libmlx4-1.0.1/README

* Re: can't get IB link with the for-next branch / IBoE patches (was "mlx4: Fix unneeded return error...")
       [not found]                 ` <4CC54D4E.7050203-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  2010-10-25  9:33                   ` Or Gerlitz
@ 2010-10-25 10:47                   ` Or Gerlitz
       [not found]                     ` <4CC5604D.2080803-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  2010-10-25 10:47                   ` can't get IB link with the for-next branch / IBoE patches (was "mlx4: Fix unneeded return error...") Or Gerlitz
  2010-10-25 11:34                   ` [PATCH] mlx4: Fix unneeded return error in eth_link_query_port Eli Cohen
  3 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25 10:47 UTC (permalink / raw)
  To: Eli Cohen, Roland Dreier; +Cc: RDMA list

> I pulled/built/booted with the for-next branch of Roland's tree, and I can't get IB link for the node, 
> I don't think this is my problem, since I'm on L2 IB and not Eth, but should this work with pre 2.7 
> firmware?! if not, maybe patch the mlx4 driver to print some error,

okay, I verified that with 2.6.36 this node gets IB link and IPoIB is
working fine, so it must be something in or related to the for-next
branch, I assume around the IBoE patches that touch mlx4 which make
this failure happen. With 2.6.36 I also see the "awk:
/etc/ofed/setup-mlx4.awk:6: (FILENAME=/etc/ofed/mlx4.conf FNR=21)
fatal: cannot open file `/sbin/setup-mlx4' for reading (No such file
or directory)" warning when loading mlx4_ib, but it isn't
disruptive in the sense that the node works fine, IB wise.


Or.


* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                 ` <4CC54D4E.7050203-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
                                     ` (2 preceding siblings ...)
  2010-10-25 10:47                   ` can't get IB link with the for-next branch / IBoE patches (was "mlx4: Fix unneeded return error...") Or Gerlitz
@ 2010-10-25 11:34                   ` Eli Cohen
  2010-10-25 14:15                     ` Or Gerlitz
  3 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-25 11:34 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 11:26:38AM +0200, Or Gerlitz wrote:
> 
> I pulled/built/booted with the for-next branch of Roland's tree, and I can't get IB link for the node, 
> I don't think this is my problem, since I'm on L2 IB and not Eth, but should this work with pre 2.7 
> firmware?! if not, maybe patch the mlx4 driver to print some error,
> 

Hi Or,
IBoE will not work with firmware prior to 2.7.000. I don't think an
error message is required in this case.

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                     ` <4CC54ED7.6030303-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-10-25 11:37                       ` Eli Cohen
  0 siblings, 0 replies; 29+ messages in thread
From: Eli Cohen @ 2010-10-25 11:37 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 11:33:11AM +0200, Or Gerlitz wrote:
> 
> Oh, I also see an error printed by some script provided by the RHEL5 libmlx4 package, 
> 

You mean you're using for-next on a RHEL5 filesystem?

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                     ` <4CC5604D.2080803-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-10-25 13:36                       ` Roland Dreier
       [not found]                         ` <adaaam2jswk.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Roland Dreier @ 2010-10-25 13:36 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Eli Cohen, RDMA list

 > okay, I verified that with 2.6.36 this node gets IB link and IPoIB is
 > working fine, so it must be something in or related to the for-next
 > branch, I assume around the IBoE patches that touch mlx4 which make
 > this failure happen. With 2.6.36 I also see the "awk:
 > /etc/ofed/setup-mlx4.awk:6: (FILENAME=/etc/ofed/mlx4.conf FNR=21)
 > fatal: cannot open file `/sbin/setup-mlx4' for reading (No such file
 > or directory)" warning when loading mlx4_ib, but it isn't
 > disruptive in the sense that the node works fine, IB wise.

I suspect I broke either the UD header packing or the build_mlx_header
function when I "cleaned up" the patches.  I see the same problem, I'll
take a look today.

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
  2010-10-25 11:34                   ` [PATCH] mlx4: Fix unneeded return error in eth_link_query_port Eli Cohen
@ 2010-10-25 14:15                     ` Or Gerlitz
       [not found]                       ` <AANLkTi=ZxB4b463OOS6YGxTJSKxjyCj8vy0rNtj0n+uA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25 14:15 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Or Gerlitz, Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 1:34 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> IBoE will not work with firmware prior to 2.7.000. I don't think an
> error message is required in this case.

But I'm on **IB** not IBoE, I don't think you mean that the Linux
kernel IB stack is not functional over pre-2.7 firmware with the IBoE
patches?! are you?

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                       ` <AANLkTi=ZxB4b463OOS6YGxTJSKxjyCj8vy0rNtj0n+uA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-10-25 14:36                         ` Eli Cohen
  2010-10-25 16:46                           ` Or Gerlitz
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-25 14:36 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Or Gerlitz, Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 04:15:47PM +0200, Or Gerlitz wrote:
> On Mon, Oct 25, 2010 at 1:34 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> 
> But I'm on **IB** not IBoE, I don't think you mean that the Linux
> kernel IB stack is not functional over pre-2.7 firmware with the IBoE
> patches?! are you?

Of course not. I just noticed that the IB link for IB link layer does
come up, is that what you're seeing?

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                         ` <adaaam2jswk.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-10-25 16:17                           ` Eli Cohen
  2010-10-25 16:45                             ` Or Gerlitz
  2010-10-25 17:23                             ` Roland Dreier
  0 siblings, 2 replies; 29+ messages in thread
From: Eli Cohen @ 2010-10-25 16:17 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Or Gerlitz, RDMA list

On Mon, Oct 25, 2010 at 06:36:43AM -0700, Roland Dreier wrote:
> 
> I suspect I broke either the UD header packing or the build_mlx_header
> function when I "cleaned up" the patches.  I see the same problem, I'll
> take a look today.

I think this will fix things up. The + operator has higher precedence
than the ?: operator, so we end up with packet_length equal to
IB_GRH_BYTES / 4, which is wrong.

diff --git a/drivers/infiniband/core/ud_header.c b/drivers/infiniband/core/ud_header.c
index 7e5d224..bb7e192 100644
--- a/drivers/infiniband/core/ud_header.c
+++ b/drivers/infiniband/core/ud_header.c
@@ -241,7 +241,7 @@ void ib_ud_header_init(int     		    payload_bytes,
 		packet_length = (IB_LRH_BYTES	+
 				 IB_BTH_BYTES	+
 				 IB_DETH_BYTES	+
-				 grh_present ? IB_GRH_BYTES : 0 +
+				 (grh_present ? IB_GRH_BYTES : 0) +
 				 payload_bytes	+
 				 4		+ /* ICRC     */
 				 3) / 4;	  /* round up */
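
The precedence problem can be reproduced outside the kernel with a few
lines of C. The sketch below is only an illustration: the SK_* header
sizes are assumed values chosen for the example, and sk_len_buggy()
writes out with explicit parentheses how a compiler groups the
unparenthesized expression, so that it builds without precedence
warnings.

```c
#include <assert.h>

/* Header sizes in bytes; values assumed here for illustration. */
#define SK_LRH_BYTES   8
#define SK_GRH_BYTES  40
#define SK_BTH_BYTES  12
#define SK_DETH_BYTES  8

/*
 * How "a + b + c + grh ? GRH : 0 + payload + 4 + 3" is actually grouped:
 * everything left of '?' becomes the condition, everything right of ':'
 * becomes the else branch. The condition is always non-zero here, so the
 * result is always SK_GRH_BYTES / 4.
 */
static int sk_len_buggy(int grh_present, int payload_bytes)
{
	assert(payload_bytes >= 0);
	return ((SK_LRH_BYTES + SK_BTH_BYTES + SK_DETH_BYTES + grh_present)
		? SK_GRH_BYTES
		: (0 + payload_bytes + 4 + 3)) / 4;
}

/* Intended computation: the GRH term is parenthesized on its own. */
static int sk_len_fixed(int grh_present, int payload_bytes)
{
	assert(payload_bytes >= 0);
	return (SK_LRH_BYTES + SK_BTH_BYTES + SK_DETH_BYTES +
		(grh_present ? SK_GRH_BYTES : 0) +
		payload_bytes +
		4 +		/* ICRC */
		3) / 4;		/* round up to 4-byte words */
}
```

With these assumed sizes and a 100-byte payload, sk_len_buggy() gives 10
whether or not a GRH is present, while sk_len_fixed() gives 33 without a
GRH and 43 with one.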

* Re: can't get IB link with the for-next branch / IBoE patches
  2010-10-25 16:17                           ` Eli Cohen
@ 2010-10-25 16:45                             ` Or Gerlitz
  2010-10-25 17:23                             ` Roland Dreier
  1 sibling, 0 replies; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25 16:45 UTC (permalink / raw)
  To: Eli Cohen, Roland Dreier, Sean Hefty, Steve Wise, Andy Grover
  Cc: Or Gerlitz, RDMA list

On Mon, Oct 25, 2010 at 6:17 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On Mon, Oct 25, 2010 at 06:36:43AM -0700, Roland Dreier wrote:

>> I suspect I broke either the UD header packing or the build_mlx_header
>> function when I "cleaned up" the patches.  I see the same problem, I'll
>> take a look today.

> I think this will fix things up. The + operator has higher precedence
> than the ?: operator, so we end up with packet_length equal to
> IB_GRH_BYTES / 4, which is wrong.

Once you guys feel you have a fix, I would be happy to give Roland's
for-next bits some further basic kernel testing (e.g. IB link up/down,
IPoIB, running the SM on a node with the IBoE patches) and some more
advanced testing (e.g. IB/iSER, IB/RDS [Andy]) to see that things are
in place with L2 IB. I would also recommend that the iWARP folks do the
same, as the addr/rdma-cm modules were also modified.

The merge window still has about 9 days left, so we're okay with delaying
the push by 1-2 days. Thoughts, people?

Or.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
  2010-10-25 14:36                         ` Eli Cohen
@ 2010-10-25 16:46                           ` Or Gerlitz
       [not found]                             ` <AANLkTimaEcFZMnYE+G3osTWzPkfxuBpRMtrrXF7xUPYv-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25 16:46 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Or Gerlitz, Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 4:36 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> Of course not. I just noticed that the IB link for IB link layer does
> come up, is that what you're seeing?

No, I didn't have IB Link when I used the for-next bits

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                             ` <AANLkTimaEcFZMnYE+G3osTWzPkfxuBpRMtrrXF7xUPYv-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-10-25 17:13                               ` Eli Cohen
  2010-10-25 19:04                                 ` Or Gerlitz
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-25 17:13 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Or Gerlitz, Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 06:46:39PM +0200, Or Gerlitz wrote:
> 
> No, I didn't have IB Link when I used the for-next bits

Can you summarize the problem that you're seeing?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: can't get IB link with the for-next branch / IBoE patches
  2010-10-25 16:17                           ` Eli Cohen
  2010-10-25 16:45                             ` Or Gerlitz
@ 2010-10-25 17:23                             ` Roland Dreier
       [not found]                               ` <adalj5mgp96.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 29+ messages in thread
From: Roland Dreier @ 2010-10-25 17:23 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Or Gerlitz, RDMA list

 > I think this will fix things up. The + operator has precedence over
 > the ? operator so we end up with packet_length equal IB_GRH_BYTES / 4
 > which is wrong.

Yep, looks like that's where my cleanup broke things.  I rolled this in
and pushed it out; I'm testing it myself now.

 - R.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                               ` <adalj5mgp96.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-10-25 17:35                                 ` Roland Dreier
       [not found]                                   ` <adahbgagopw.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Roland Dreier @ 2010-10-25 17:35 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Or Gerlitz, RDMA list

 > Yep, looks like that's where my cleanup broke things.  I rolled this in
 > and pushed it out; I'm testing it myself now.

My IB port comes to active now, I think that fixed things.

 - R.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
  2010-10-25 17:13                               ` Eli Cohen
@ 2010-10-25 19:04                                 ` Or Gerlitz
       [not found]                                   ` <AANLkTik_4OzMLMWXud89m_rF47OQ3Wji9R_Bye+0DcTV-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25 19:04 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Or Gerlitz, Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 7:13 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> On Mon, Oct 25, 2010 at 06:46:39PM +0200, Or Gerlitz wrote:
>> No, I didn't have IB Link when I used the for-next bits
> Can you summarize what is the problem that you're seeing?

Eli, this is pretty simple, I do the following
1. pull/build/boot Roland's for-next
2. modprobe mlx4_ib
--> the port state is INIT forever, is that clear?

Or.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                                   ` <AANLkTik_4OzMLMWXud89m_rF47OQ3Wji9R_Bye+0DcTV-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-10-25 19:15                                     ` Eli Cohen
       [not found]                                       ` <AANLkTi=yZUoexwVUCfbeGypEWC_8=oZaMu9mBTF+VJgq-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Eli Cohen @ 2010-10-25 19:15 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Or Gerlitz, Roland Dreier, RDMA list

First, I suggest that you sync your tree again.
Second, I assume your link layer is IB since Ethernet will show either
Down or Active, so I think you should be fine after syncing.
If that does not resolve it, please describe in more detail the
problem that you're seeing.

On Mon, Oct 25, 2010 at 9:04 PM, Or Gerlitz <or.gerlitz-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Mon, Oct 25, 2010 at 7:13 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
>> On Mon, Oct 25, 2010 at 06:46:39PM +0200, Or Gerlitz wrote:
>>> No, I didn't have IB Link when I used the for-next bits
>> Can you summarize what is the problem that you're seeing?
>
> Eli, this is pretty simple, I do the following
> 1. pull/build/boot Roland's for-next
> 2. modprobe mlx4_ib
> --> the port state is INIT forever, is that clear?
>
> Or.
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] mlx4: Fix unneeded return error in eth_link_query_port
       [not found]                                       ` <AANLkTi=yZUoexwVUCfbeGypEWC_8=oZaMu9mBTF+VJgq-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-10-25 19:38                                         ` Or Gerlitz
  0 siblings, 0 replies; 29+ messages in thread
From: Or Gerlitz @ 2010-10-25 19:38 UTC (permalink / raw)
  To: Eli Cohen; +Cc: Or Gerlitz, Roland Dreier, RDMA list

On Mon, Oct 25, 2010 at 9:15 PM, Eli Cohen <eli-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:
> First, I suggest that you sync your tree again.

I'll be able to do that tomorrow

> Second, I assume your link layer is IB since

I wrote in black and white that it's IB; what wasn't clear here?

Or.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                                   ` <adahbgagopw.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
@ 2010-10-26  9:33                                     ` Or Gerlitz
       [not found]                                       ` <4CC6A051.3010703-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-26  9:33 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Eli Cohen, RDMA list

Roland Dreier wrote:
>> Yep, looks like that's where my cleanup broke things.  I rolled this in
>> and pushed it out; I'm testing it myself now.
 
> My IB port comes to active now, I think that fixed things.

Same here: I have the IB port coming to active, and basic IPoIB and opensm are working okay
on the node with the current for-next/IBoE bits.

Or.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                                       ` <4CC6A051.3010703-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-10-26 12:19                                         ` Or Gerlitz
       [not found]                                           ` <4CC6C75F.8030103-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  0 siblings, 1 reply; 29+ messages in thread
From: Or Gerlitz @ 2010-10-26 12:19 UTC (permalink / raw)
  To: Roland Dreier; +Cc: Eli Cohen, RDMA list, Andy Grover

> I have IB port coming to active and basic IPoIB, opensm working okay
> on the node with the current for-next/IBoE bits

Doing a little bit of stress testing, I came across the oops below when running IPoIB
with a couple of iperf/UDP sessions; it doesn't look like a problem in the IB stack.

Also with RDS, using rds-stress from rds-tools-1.5-1.el5 and "rds-stress -s 192.168.20.18 -p 4000 -t 1 
-q 1K -a 1K -D 1M" on the client side, the node running the for-next/IBoE bits and acting as the passive side
of the test hung. Here too, this could be a bug in RDS and not in the IBoE patches; I know that the RDS
guys queued about a hundred(!) patches for 2.6.37, so with those patches things might be better. I have
the oops trace as a JPG and will send it to Andy, Roland and Eli.

I guess we can continue these tests for 2-3 days and have the push over the weekend, or push it
before then and get fixes, if needed, through the -rc cycle.


Oct 26 12:36:30 nsg2 kernel: BUG: spinlock bad magic on CPU#0, iperf/20845
Oct 26 12:36:30 nsg2 kernel:  lock: ffffffff81663ef8, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
Oct 26 12:36:30 nsg2 kernel: Pid: 20845, comm: iperf Not tainted 2.6.36-rc5-42052-gce806e1 #1
Oct 26 12:36:30 nsg2 kernel: Call Trace:
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff811542b7>] ? do_raw_spin_lock+0x22/0x122
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81268b2b>] ? dev_queue_xmit+0x10d/0x346
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8128ca13>] ? ip_push_pending_frames+0x2bf/0x318
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812a7e66>] ? udp_push_pending_frames+0x2d2/0x351
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812a970c>] ? udp_sendmsg+0x4b0/0x59c
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e9f7>] ? cap_socket_sendmsg+0x0/0x3
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e9f7>] ? cap_socket_sendmsg+0x0/0x3
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81256bbb>] ? sock_aio_write+0xf5/0x10d
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810029ae>] ? reschedule_interrupt+0xe/0x20
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810b9b49>] ? do_sync_write+0xab/0xeb
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7abf>] ? _raw_spin_unlock_irq+0x9/0xd
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e83f>] ? security_file_permission+0x18/0x6b
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810ba1f7>] ? vfs_write+0xbe/0x132
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810ba754>] ? sys_write+0x45/0x6e
Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81001e6b>] ? system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                                           ` <4CC6C75F.8030103-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
@ 2010-10-26 13:10                                             ` Or Gerlitz
  2010-10-26 13:46                                             ` Eli Cohen
  1 sibling, 0 replies; 29+ messages in thread
From: Or Gerlitz @ 2010-10-26 13:10 UTC (permalink / raw)
  To: Roland Dreier, Eli Cohen; +Cc: RDMA list

> doing a little bit stress testing, I came across the below oops, when running IPoIB
> and couple of iperf/udp sessions, it doesn't look like a problem in the IB stack.

To trigger this, I ran the following from the client node: "iperf -uc 192.168.21.18 -l 64000 -t 72000 -i 1 -b 40g  -d -P 4". The server node (21.18 here) was the one that had the IBoE patches and got this oops.

> Oct 26 12:36:30 nsg2 kernel: BUG: spinlock bad magic on CPU#0, iperf/20845
> Oct 26 12:36:30 nsg2 kernel:  lock: ffffffff81663ef8, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> Oct 26 12:36:30 nsg2 kernel: Pid: 20845, comm: iperf Not tainted 2.6.36-rc5-42052-gce806e1 #1
> Oct 26 12:36:30 nsg2 kernel: Call Trace:
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff811542b7>] ? do_raw_spin_lock+0x22/0x122
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81268b2b>] ? dev_queue_xmit+0x10d/0x346
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8128ca13>] ? ip_push_pending_frames+0x2bf/0x318
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812a7e66>] ? udp_push_pending_frames+0x2d2/0x351
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812a970c>] ? udp_sendmsg+0x4b0/0x59c
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e9f7>] ? cap_socket_sendmsg+0x0/0x3
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e9f7>] ? cap_socket_sendmsg+0x0/0x3
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81256bbb>] ? sock_aio_write+0xf5/0x10d
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810029ae>] ? reschedule_interrupt+0xe/0x20
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810b9b49>] ? do_sync_write+0xab/0xeb
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7abf>] ? _raw_spin_unlock_irq+0x9/0xd
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e83f>] ? security_file_permission+0x18/0x6b
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810ba1f7>] ? vfs_write+0xbe/0x132
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810ba754>] ? sys_write+0x45/0x6e
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81001e6b>] ? system_call_fastpath+0x16/0x1b


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: can't get IB link with the for-next branch / IBoE patches
       [not found]                                           ` <4CC6C75F.8030103-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
  2010-10-26 13:10                                             ` Or Gerlitz
@ 2010-10-26 13:46                                             ` Eli Cohen
  1 sibling, 0 replies; 29+ messages in thread
From: Eli Cohen @ 2010-10-26 13:46 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Roland Dreier, RDMA list, Andy Grover

I'll try to reproduce here. Anyway, I don't see how it could possibly
be related to IBoE.

On Tue, Oct 26, 2010 at 02:19:43PM +0200, Or Gerlitz wrote:
> > I have IB port coming to active and basic IPoIB, opensm working okay
> > on the node with the current for-next/IBoE bits
> 
> doing a little bit stress testing, I came across the below oops, when running IPoIB
> and couple of iperf/udp sessions, it doesn't look like a problem in the IB stack.
> 
> Also with rds, using rds-stress from rds-tools-1.5-1.el5 and "rds-stress -s 192.168.20.18 -p 4000 -t 1 
> -q 1K -a 1K -D 1M" on the client side, the node running the for-next/IBoE bits and acting as the passive side
> of the test, got hanged. Also here, this could be a bug in RDS and not in the IBoE patches, I know that the rds guys queued about a hundred! patches for 2.6.37 so with these patches things might be better. I have the oops trace in jpg, will send to Andy, Roland and Eli. I guess we can continue these tests for 2-3 days and have the push over the weekend, or push it before and get fixes if needed through the -rc cycle.
> 
> 
> Oct 26 12:36:30 nsg2 kernel: BUG: spinlock bad magic on CPU#0, iperf/20845
> Oct 26 12:36:30 nsg2 kernel:  lock: ffffffff81663ef8, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
> Oct 26 12:36:30 nsg2 kernel: Pid: 20845, comm: iperf Not tainted 2.6.36-rc5-42052-gce806e1 #1
> Oct 26 12:36:30 nsg2 kernel: Call Trace:
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff811542b7>] ? do_raw_spin_lock+0x22/0x122
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81268b2b>] ? dev_queue_xmit+0x10d/0x346
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8128ca13>] ? ip_push_pending_frames+0x2bf/0x318
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812a7e66>] ? udp_push_pending_frames+0x2d2/0x351
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812a970c>] ? udp_sendmsg+0x4b0/0x59c
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e9f7>] ? cap_socket_sendmsg+0x0/0x3
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e9f7>] ? cap_socket_sendmsg+0x0/0x3
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81256bbb>] ? sock_aio_write+0xf5/0x10d
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810029ae>] ? reschedule_interrupt+0xe/0x20
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7d8e>] ? common_interrupt+0xe/0x13
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810b9b49>] ? do_sync_write+0xab/0xeb
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff812e7abf>] ? _raw_spin_unlock_irq+0x9/0xd
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff8112e83f>] ? security_file_permission+0x18/0x6b
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810ba1f7>] ? vfs_write+0xbe/0x132
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff810ba754>] ? sys_write+0x45/0x6e
> Oct 26 12:36:30 nsg2 kernel:  [<ffffffff81001e6b>] ? system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2010-10-26 13:46 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-24  7:58 [PATCH] mlx4: Fix unneeded return error in eth_link_query_port Eli Cohen
2010-10-24 15:42 ` Roland Dreier
     [not found]   ` <adahbgbppgx.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-24 16:00     ` Eli Cohen
2010-10-24 16:22       ` Roland Dreier
     [not found]         ` <adaocajo90s.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-24 18:05           ` Eli Cohen
     [not found]             ` <AANLkTimb++kFYFXCWajBGACpA1OpCXyyeyD-98Ed3uTu-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-25  4:12               ` Roland Dreier
     [not found]                 ` <adad3qync5m.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-25  8:18                   ` Eli Cohen
2010-10-25  9:26               ` Or Gerlitz
     [not found]                 ` <4CC54D4E.7050203-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-10-25  9:33                   ` Or Gerlitz
     [not found]                     ` <4CC54ED7.6030303-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-10-25 11:37                       ` Eli Cohen
2010-10-25 10:47                   ` can't get IB link with the for-next branch / IBoE patches (was "mlx4: Fix unneeded return error...") Or Gerlitz
     [not found]                     ` <4CC5604D.2080803-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-10-25 13:36                       ` can't get IB link with the for-next branch / IBoE patches Roland Dreier
     [not found]                         ` <adaaam2jswk.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-25 16:17                           ` Eli Cohen
2010-10-25 16:45                             ` Or Gerlitz
2010-10-25 17:23                             ` Roland Dreier
     [not found]                               ` <adalj5mgp96.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-25 17:35                                 ` Roland Dreier
     [not found]                                   ` <adahbgagopw.fsf-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>
2010-10-26  9:33                                     ` Or Gerlitz
     [not found]                                       ` <4CC6A051.3010703-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-10-26 12:19                                         ` Or Gerlitz
     [not found]                                           ` <4CC6C75F.8030103-hKgKHo2Ms0FWk0Htik3J/w@public.gmane.org>
2010-10-26 13:10                                             ` Or Gerlitz
2010-10-26 13:46                                             ` Eli Cohen
2010-10-25 10:47                   ` can't get IB link with the for-next branch / IBoE patches (was "mlx4: Fix unneeded return error...") Or Gerlitz
2010-10-25 11:34                   ` [PATCH] mlx4: Fix unneeded return error in eth_link_query_port Eli Cohen
2010-10-25 14:15                     ` Or Gerlitz
     [not found]                       ` <AANLkTi=ZxB4b463OOS6YGxTJSKxjyCj8vy0rNtj0n+uA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-25 14:36                         ` Eli Cohen
2010-10-25 16:46                           ` Or Gerlitz
     [not found]                             ` <AANLkTimaEcFZMnYE+G3osTWzPkfxuBpRMtrrXF7xUPYv-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-25 17:13                               ` Eli Cohen
2010-10-25 19:04                                 ` Or Gerlitz
     [not found]                                   ` <AANLkTik_4OzMLMWXud89m_rF47OQ3Wji9R_Bye+0DcTV-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-25 19:15                                     ` Eli Cohen
     [not found]                                       ` <AANLkTi=yZUoexwVUCfbeGypEWC_8=oZaMu9mBTF+VJgq-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-10-25 19:38                                         ` Or Gerlitz
