* 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
@ 2013-05-24 15:49 Shawn Bohrer
  2013-05-24 16:34 ` Shawn Bohrer
  2013-05-25  3:49 ` Or Gerlitz
  0 siblings, 2 replies; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-24 15:49 UTC (permalink / raw)
  To: netdev; +Cc: Or Gerlitz, Hadar Hen Zion, Rony Efraim, Amir Vadai

I just started testing the 3.10 kernel, previously we were on 3.4 so
there is a fairly large jump.  I've additionally applied the following
four patches to the 3.10.0-rc2 kernel that I'm testing:

https://patchwork.kernel.org/patch/2484651/
https://patchwork.kernel.org/patch/2484671/
https://patchwork.kernel.org/patch/2484681/
https://patchwork.kernel.org/patch/2484641/

I don't know if those patches are related to my issues or not but I
plan on trying to reproduce without them soon.

The issue I'm seeing is that our applications listen on a number of
multicast addresses.  In this case I'm listening to about 350
different addresses per machine, across many different processes, with
usually one socket per address.  The problem is that some of the
sockets are not receiving any data and some are, even though they all
should be.  If I put the device in promiscuous mode then I start
receiving data on all of my sockets.  Running netstat -g shows all of
my memberships so it appears to me that the kernel and the switch
think I've joined the groups, but the card may be filtering the data.
This is with:

05:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

# ethtool -i eth4
driver: mlx4_en
version: 2.0 (Dec 2011)
firmware-version: 2.11.500
bus-info: 0000:05:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
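
For reference, each socket in our application is a plain AF_INET/SOCK_DGRAM
socket that joins its group with IP_ADD_MEMBERSHIP, roughly like the
following (simplified sketch, not our actual application code; the helper
name and error handling are illustrative):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Simplified sketch: open a UDP socket, bind it to the group address,
 * and join the group on the given local interface.  Returns the fd, or
 * -1 on error (most error handling trimmed). */
static int join_group(const char *group, const char *local_ifaddr, int port)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in addr;
	struct ip_mreq mreq;

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(port);
	addr.sin_addr.s_addr = inet_addr(group);
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		return -1;

	memset(&mreq, 0, sizeof(mreq));
	mreq.imr_multiaddr.s_addr = inet_addr(group);
	mreq.imr_interface.s_addr = inet_addr(local_ifaddr);
	if (setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP,
		       &mreq, sizeof(mreq)) < 0)
		return -1;

	return fd;
}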

The other strange part is that I've got multiple machines all running
the same kernel and not all of them are experiencing the issue.  At
one point they were all working fine, but the issue appeared after I
rebooted one of the machines, and multiple reboots later it is still
in this bad state.  Rebooting that machine back to 3.4 causes it to
work as expected, but no luck under 3.10.  I've now got two machines
in this bad state, and in both cases it started immediately after a
reboot.

Does anyone have any ideas?

Thanks,
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-24 15:49 3.10.0-rc2 mlx4 not receiving packets for some multicast groups Shawn Bohrer
@ 2013-05-24 16:34 ` Shawn Bohrer
  2013-05-24 16:58   ` Eric Dumazet
  2013-05-25  3:41   ` Or Gerlitz
  2013-05-25  3:49 ` Or Gerlitz
  1 sibling, 2 replies; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-24 16:34 UTC (permalink / raw)
  To: netdev; +Cc: Or Gerlitz, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> I just started testing the 3.10 kernel, previously we were on 3.4 so
> there is a fairly large jump.  I've additionally applied the following
> four patches to the 3.10.0-rc2 kernel that I'm testing:
> 
> https://patchwork.kernel.org/patch/2484651/
> https://patchwork.kernel.org/patch/2484671/
> https://patchwork.kernel.org/patch/2484681/
> https://patchwork.kernel.org/patch/2484641/
> 
> I don't know if those patches are related to my issues or not but I
> plan on trying to reproduce without them soon.

I've reverted the four patches above from my test kernel and still see
the issue so they don't appear to be the cause.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-24 16:34 ` Shawn Bohrer
@ 2013-05-24 16:58   ` Eric Dumazet
  2013-05-25  3:41   ` Or Gerlitz
  1 sibling, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2013-05-24 16:58 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: netdev, Or Gerlitz, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Fri, 2013-05-24 at 11:34 -0500, Shawn Bohrer wrote:
> On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > there is a fairly large jump.  I've additionally applied the following
> > four patches to the 3.10.0-rc2 kernel that I'm testing:
> > 
> > https://patchwork.kernel.org/patch/2484651/
> > https://patchwork.kernel.org/patch/2484671/
> > https://patchwork.kernel.org/patch/2484681/
> > https://patchwork.kernel.org/patch/2484641/
> > 
> > I don't know if those patches are related to my issues or not but I
> > plan on trying to reproduce without them soon.
> 
> I've reverted the four patches above from my test kernel and still see
> the issue so they don't appear to be the cause.

I suggest adding tests for multicast stuff under tools/testing/selftests/net.

It seems many NICs suffer from bugs in this area, especially when dealing
with a lot of groups.
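
Something along these lines could be a starting point -- a rough,
untested sketch (the group range, port and counts are arbitrary): join a
few hundred consecutive groups, let a peer host send one datagram to each
group, then report every group that received nothing.

/* multicast_many_groups.c - rough, untested sketch of a selftest that
 * joins NGROUPS consecutive multicast groups, gives a peer host time to
 * send one datagram to each group:PORT, then reports the groups that
 * received nothing.  Error handling omitted for brevity; a real test
 * would also pin the receive interface instead of using INADDR_ANY.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define NGROUPS		350
#define BASE_GROUP	0xef010101	/* 239.1.1.1, arbitrary */
#define PORT		12345

int main(void)
{
	int fds[NGROUPS], missing = 0, i;

	for (i = 0; i < NGROUPS; i++) {
		struct sockaddr_in addr;
		struct ip_mreq mreq;
		int one = 1;

		fds[i] = socket(AF_INET, SOCK_DGRAM, 0);
		setsockopt(fds[i], SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

		/* bind to the group address so each socket only sees its group */
		memset(&addr, 0, sizeof(addr));
		addr.sin_family = AF_INET;
		addr.sin_port = htons(PORT);
		addr.sin_addr.s_addr = htonl(BASE_GROUP + i);
		bind(fds[i], (struct sockaddr *)&addr, sizeof(addr));

		memset(&mreq, 0, sizeof(mreq));
		mreq.imr_multiaddr.s_addr = htonl(BASE_GROUP + i);
		mreq.imr_interface.s_addr = htonl(INADDR_ANY);
		setsockopt(fds[i], IPPROTO_IP, IP_ADD_MEMBERSHIP,
			   &mreq, sizeof(mreq));
	}

	sleep(5);	/* peer sends one datagram per group meanwhile */

	for (i = 0; i < NGROUPS; i++) {
		char buf[64];

		if (recv(fds[i], buf, sizeof(buf), MSG_DONTWAIT) <= 0)
			missing++;
	}
	printf("%d of %d groups received nothing\n", missing, NGROUPS);
	return missing ? 1 : 0;
}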

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-24 16:34 ` Shawn Bohrer
  2013-05-24 16:58   ` Eric Dumazet
@ 2013-05-25  3:41   ` Or Gerlitz
  2013-05-25 15:13     ` Shawn Bohrer
  1 sibling, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2013-05-25  3:41 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Fri, May 24, 2013 at 7:34 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > there is a fairly large jump.  I've additionally applied the following
> > four patches to the 3.10.0-rc2 kernel that I'm testing:
> >
> > https://patchwork.kernel.org/patch/2484651/
> > https://patchwork.kernel.org/patch/2484671/
> > https://patchwork.kernel.org/patch/2484681/
> > https://patchwork.kernel.org/patch/2484641/
> >
>> I don't know if those patches are related to my issues or not but I
>> plan on trying to reproduce without them soon.

> I've reverted the four patches above from my test kernel and still see
> the issue so they don't appear to be the cause.

Hi Shawn,

So 3.4 works and 3.10-rc2 breaks? It's indeed a fairly large gap; maybe
try to bisect it? Just to make sure, did you touch any non-default mlx4
config? Specifically, did you turn on DMFS (Device Managed Flow Steering)
by setting the mlx4_core module parameter log_num_mgm_entry_size, or were
you using B0 steering (the default)?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-24 15:49 3.10.0-rc2 mlx4 not receiving packets for some multicast groups Shawn Bohrer
  2013-05-24 16:34 ` Shawn Bohrer
@ 2013-05-25  3:49 ` Or Gerlitz
  2013-05-25 14:02   ` Shawn Bohrer
  1 sibling, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2013-05-25  3:49 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Fri, May 24, 2013 at 6:49 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> I just started testing the 3.10 kernel, previously we were on 3.4 so
> there is a fairly large jump.
[...]

> 05:00.0 Network controller: Mellanox Technologies MT27500 Family
> [ConnectX-3]
>
> # ethtool -i eth4
> driver: mlx4_en
> version: 2.0 (Dec 2011)
> firmware-version: 2.11.500

Did you change firmware between the point it was working to where you are now?

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-25  3:49 ` Or Gerlitz
@ 2013-05-25 14:02   ` Shawn Bohrer
  0 siblings, 0 replies; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-25 14:02 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Rony Efraim, Amir Vadai

On Sat, May 25, 2013 at 06:49:22AM +0300, Or Gerlitz wrote:
> On Fri, May 24, 2013 at 6:49 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > there is a fairly large jump.
> [...]
> 
> > 05:00.0 Network controller: Mellanox Technologies MT27500 Family
> > [ConnectX-3]
> >
> > # ethtool -i eth4
> > driver: mlx4_en
> > version: 2.0 (Dec 2011)
> > firmware-version: 2.11.500
> 
> Did you change firmware between the point it was working to where you are now?

Nope, we've been using 2.11.500 for a while now.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-25  3:41   ` Or Gerlitz
@ 2013-05-25 15:13     ` Shawn Bohrer
  2013-05-25 19:41       ` Or Gerlitz
  2013-05-28 20:15       ` Shawn Bohrer
  0 siblings, 2 replies; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-25 15:13 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 06:41:05AM +0300, Or Gerlitz wrote:
> On Fri, May 24, 2013 at 7:34 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > > there is a fairly large jump.  I've additionally applied the following
> > > four patches to the 3.10.0-rc2 kernel that I'm testing:
> > >
> > > https://patchwork.kernel.org/patch/2484651/
> > > https://patchwork.kernel.org/patch/2484671/
> > > https://patchwork.kernel.org/patch/2484681/
> > > https://patchwork.kernel.org/patch/2484641/
> > >
> >> I don't know if those patches are related to my issues or not but I
> >> plan on trying to reproduce without them soon.
> 
> > I've reverted the four patches above from my test kernel and still see
> > the issue so they don't appear to be the cause.
> 
> Hi Shawn,
> 
> So 3.4 works and 3.10-rc2 breaks? It's indeed a fairly large gap; maybe
> try to bisect it? Just to make sure, did you touch any non-default mlx4
> config? Specifically, did you turn on DMFS (Device Managed Flow Steering)
> by setting the mlx4_core module parameter log_num_mgm_entry_size, or were
> you using B0 steering (the default)?

Initially my goal is to sanity check 3.10 before I start playing with
the knobs, so I haven't explicitly changed any new mlx4 settings yet.
We do however set some non-default values but I'm doing that on both
kernels:

mlx4_core log_num_vlan=7
mlx4_en pfctx=0xff pfcrx=0xff

I may indeed try to bisect this, but first I need to see how easily I
can reproduce it.  I did some more testing last night that left me
feeling certifiably insane.  I'll explain what I saw with hopes that
either it will confirm I'm insane or maybe actually make sense to
someone...  My testing of 3.10 has basically gone like this:

1. I have 40 test machines.  I installed 3.10.0-rc2 on machine 1,
rebooted, and it came back without any fireworks so I installed
3.10.0-rc2 on the remaining 39 machines and rebooted them all in one
shot.
2. I then started my test applications, and everything appeared to be
functioning correctly on all machines.  There were some pretty
significant end-to-end latency regressions in our system, so I started
to narrow down where the added latency might be coming from
(interrupts, memory, disk, scheduler, send/receive...).
3. 6 of my 40 machines are configured to receive the same data on
approximately 350 multicast groups.  I picked machine #1, built a new
kernel with the new adaptive NO_HZ and RCU no-CBs settings disabled,
and rebooted that machine.  When I re-ran my application, machine #1
was now only receiving data on a small fraction of the multicast groups.
4. After puzzling over machine #1 I decided to reboot machine #2 to
see if it was the reboot or the new kernel or maybe something else.
When machine #2 came back it was in the same state as machine #1 and
only received multicast data on a small number of the 350 groups.
This meant it wasn't my config change but the reboot that triggered
the issue.
5. Debugging I noticed that tcpdump on machine #1 or #2 caused them to
suddenly receive data, and simply putting the interface in promiscuous
mode had the same result.  I rebooted both machine #1 and #2 several
times and each time they had the same issue.  I then rebooted them
back into 3.4 and they both functioned as expected and received data
on all 350 groups.  Rebooted them both back into 3.10 and they were
both still broken.  This is when I sent my initial email to netdev.

*Here is where I went insane*
6. I still had 6 machines all configured the same and receiving the
same data.  #1 and #2 were still broken, so I decided to see what would
happen if I simply rebooted #3.  I rebooted #3, started my application,
and as I sort of expected #3 no longer received data on most of the
multicast groups.  The crazy part was that machine #1 was now working!
I didn't touch that machine at all, just stopped and restarted my
application.
7. Confused, I rebooted #4.  Again machine #4 was now broken, and
magically machine #2 started working.
8. When I rebooted machine #5 it came back and received all of the
data, but it also magically fixed #3.
9. At this point my brain was fried and it was time to go home, so I
rebooted all machines back to 3.4 and gave up.

I'll revisit this again next week.

Thanks,
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-25 15:13     ` Shawn Bohrer
@ 2013-05-25 19:41       ` Or Gerlitz
  2013-05-25 21:37         ` Shawn Bohrer
  2013-05-28 20:15       ` Shawn Bohrer
  1 sibling, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2013-05-25 19:41 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 6:13 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
[...]
> 5. Debugging I noticed that tcpdump on machine #1 or #2 caused them to
> suddenly receive data, and simply putting the interface in promiscuous
> mode had the same result.  I rebooted both machine #1 and #2 several
> times and each time they had the same issue.  I then rebooted them
> back into 3.4 and they both functioned as expected and received data
> on all 350 groups.  Rebooted them both back into 3.10 and they were
> both still broken.  This is when I sent my initial email to netdev.
[..]

Shawn, thanks for all the details. Just one small confirmation: when
you moved from 3.4 to 3.10, did you make ANY change to your app? E.g.
do you still use the same QP type (UD), or did you move to RAW PACKET?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-25 19:41       ` Or Gerlitz
@ 2013-05-25 21:37         ` Shawn Bohrer
  0 siblings, 0 replies; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-25 21:37 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 10:41:39PM +0300, Or Gerlitz wrote:
> On Sat, May 25, 2013 at 6:13 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> [...]
> > 5. Debugging I noticed that tcpdump on machine #1 or #2 caused them to
> > suddenly receive data, and simply putting the interface in promiscuous
> > mode had the same result.  I rebooted both machine #1 and #2 several
> > times and each time they had the same issue.  I then rebooted them
> > back into 3.4 and they both functioned as expected and received data
> > on all 350 groups.  Rebooted them both back into 3.10 and they were
> > both still broken.  This is when I sent my initial email to netdev.
> [..]
> 
> Shawn, thanks for all the details. Just one small confirmation: when
> you moved from 3.4 to 3.10, did you make ANY change to your app? E.g.
> do you still use the same QP type (UD), or did you move to RAW PACKET?

No modifications have been made to the application.  In this case I'm
using plain old UDP multicast sockets over 10g Ethernet.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-25 15:13     ` Shawn Bohrer
  2013-05-25 19:41       ` Or Gerlitz
@ 2013-05-28 20:15       ` Shawn Bohrer
  2013-05-29 13:55         ` Or Gerlitz
  1 sibling, 1 reply; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-28 20:15 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Sat, May 25, 2013 at 10:13:47AM -0500, Shawn Bohrer wrote:
> On Sat, May 25, 2013 at 06:41:05AM +0300, Or Gerlitz wrote:
> > On Fri, May 24, 2013 at 7:34 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > > On Fri, May 24, 2013 at 10:49:31AM -0500, Shawn Bohrer wrote:
> > > > I just started testing the 3.10 kernel, previously we were on 3.4 so
> > > > there is a fairly large jump.  I've additionally applied the following
> > > > four patches to the 3.10.0-rc2 kernel that I'm testing:
> > > >
> > > > https://patchwork.kernel.org/patch/2484651/
> > > > https://patchwork.kernel.org/patch/2484671/
> > > > https://patchwork.kernel.org/patch/2484681/
> > > > https://patchwork.kernel.org/patch/2484641/
> > > >
> > >> I don't know if those patches are related to my issues or not but I
> > >> plan on trying to reproduce without them soon.
> > 
> > > I've reverted the four patches above from my test kernel and still see
> > > the issue so they don't appear to be the cause.
> > 
> > Hi Shawn,
> > 
> > So 3.4 works and 3.10-rc2 breaks? It's indeed a fairly large gap; maybe
> > try to bisect it? Just to make sure, did you touch any non-default mlx4
> > config? Specifically, did you turn on DMFS (Device Managed Flow Steering)
> > by setting the mlx4_core module parameter log_num_mgm_entry_size, or were
> > you using B0 steering (the default)?
> 
> Initially my goal is to sanity check 3.10 before I start playing with
> the knobs, so I haven't explicitly changed any new mlx4 settings yet.
> We do however set some non-default values but I'm doing that on both
> kernels:
> 
> mlx4_core log_num_vlan=7
> mlx4_en pfctx=0xff pfcrx=0xff

Naturally I was wrong and we set more than the above non-default
values.  We additionally set high_rate_steer=1 on mlx4_core. As
you may know this parameter isn't currently available in the upstream
driver, so I've been carrying the following patch in my 3.4 and 3.10
trees:

---
 drivers/net/ethernet/mellanox/mlx4/main.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 0d32a82..7808e4a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -71,6 +71,11 @@ static int msi_x = 1;
 module_param(msi_x, int, 0444);
 MODULE_PARM_DESC(msi_x, "attempt to use MSI-X if nonzero");
 
+static int high_rate_steer;
+module_param(high_rate_steer, int, 0444);
+MODULE_PARM_DESC(high_rate_steer, "Enable steering mode for higher packet rate"
+                                  " (default off)");
+
 #else /* CONFIG_PCI_MSI */
 
 #define msi_x (0)
@@ -288,6 +293,11 @@ static int mlx4_dev_cap(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 	if (mlx4_is_mfunc(dev))
 		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_SENSE_SUPPORT;
 
+	if (high_rate_steer && !mlx4_is_mfunc(dev)) {
+		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_VEP_UC_STEER;
+		dev->caps.flags &= ~MLX4_DEV_CAP_FLAG_VEP_MC_STEER;
+	}
+
 	dev->caps.log_num_macs  = log_num_mac;
 	dev->caps.log_num_vlans = MLX4_LOG_NUM_VLANS;
 	dev->caps.log_num_prios = use_prio ? 3 : 0;
-- 

What I've found really happened is:

1. I installed 3.10, rebooted, and everything worked.  high_rate_steer=1
was set at this point.
2. Our configuration management software saw the new kernel and
disabled high_rate_steer.
3. As I rebooted machines, high_rate_steer was cleared and they no
longer received multicast data on most of their addresses.

I've confirmed that with the above high_rate_steer patch and
high_rate_steer=1 I receive data on 3.10.0-rc3 and with
high_rate_steer=0 I only receive data on a small number of multicast
addresses.  With 3.4 and the same patch I receive data in both cases.

I also previously claimed that rebooting one machine appeared to make
a different machine receive data.  I doubt this was true.  Instead
what I think happened was that each time I start my application a
different set of multicast groups will receive data and the rest will
not.  I did not verify that all groups were actually receiving data
and thus am guessing I just happened to get lucky and see a few new
ones working that previously were not.

So now that we know that high_rate_steer=1 fixes my multicast issue,
does that provide any clues as to why I do not receive data on all
multicast groups without it?  Additionally, and I'm sure I should have
asked this earlier, is there a reason the high_rate_steer option has
not been upstreamed?  I can see that the out-of-tree Mellanox driver
now additionally clears MLX4_DEV_CAP_FLAG2_FS_EN when high_rate_steer=1
and has moved that code into choose_steering_mode(), so my local patch
probably needs an update if this isn't going upstream.  For a little
background, the reason we use the high_rate_steer=1 option is that it
enabled us to handle larger/faster bursts of packets without drops.
Historically we got very similar results using log_num_mgm_entry_size=7,
but we stuck with high_rate_steer=1 simply because we had tried and
verified it first.  For those wondering, using log_num_mgm_entry_size=7
with high_rate_steer=0 on 3.10 does not work either, since I do not
receive data on all multicast groups.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-28 20:15       ` Shawn Bohrer
@ 2013-05-29 13:55         ` Or Gerlitz
  2013-05-30 20:31           ` Shawn Bohrer
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2013-05-29 13:55 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: netdev, Hadar Hen Zion, Amir Vadai

On Tue, May 28, 2013 at 11:15 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> Naturally I was wrong and we set more than the above non-default
> values.  We additionally set high_rate_steer=1 on mlx4_core. As
> you may know this parameter isn't currently available in the upstream
> driver, so I've been carrying the following patch in my 3.4 and 3.10 trees:

[...]

> I've confirmed that with the above high_rate_steer patch and
> high_rate_steer=1 I receive data on 3.10.0-rc3 and with
> high_rate_steer=0 I only receive data on a small number of multicast
> addresses.  With 3.4 and the same patch I receive data in both cases.

[...]

Shawn, so with the end goal in mind, you want the NIC steering mode to
be DMFS (Device Managed Flow Steering), e.g. for the processes
bypassing the kernel, correct?  Since the NIC steering mode is global,
you will not be able to use that non-upstream patch moving forward.
So we need to debug/bisect why, without the patch (what you call
high_rate_steer=0), you don't get data on all groups.  Can you bisect
that on a single node, e.g. keep the rest of the environment on 3.4,
which works, and on a given node find the commit that breaks things?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-29 13:55         ` Or Gerlitz
@ 2013-05-30 20:31           ` Shawn Bohrer
  2013-05-30 20:42             ` Or Gerlitz
  0 siblings, 1 reply; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-30 20:31 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: netdev, Hadar Hen Zion, Amir Vadai, Vlad Yasevich

On Wed, May 29, 2013 at 04:55:32PM +0300, Or Gerlitz wrote:
> On Tue, May 28, 2013 at 11:15 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
> > Naturally I was wrong and we set more than the above non-default
> > values.  We additionally set high_rate_steer=1 on mlx4_core. As
> > you may know this parameter isn't currently available in the upstream
> > driver, so I've been carrying the following patch in my 3.4 and 3.10 trees:
> 
> [...]
> 
> > I've confirmed that with the above high_rate_steer patch and
> > high_rate_steer=1 I receive data on 3.10.0-rc3 and with
> > high_rate_steer=0 I only receive data on a small number of multicast
> > addresses.  With 3.4 and the same patch I receive data in both cases.
> 
> [...]
> 
> Shawn, so with the end goal in mind, you want the NIC steering mode to
> be DMFS (Device Managed Flow Steering), e.g. for the processes
> bypassing the kernel, correct?  Since the NIC steering mode is global,
> you will not be able to use that non-upstream patch moving forward.

Yes, the end goal is to use DMFS.  However, we have some ConnectX-2
cards which I guess do not support DMFS, and naturally I'd like plain
old UDP multicast to continue to work at the same level as on 3.4.  So
I may still want that high_rate_steer option upstreamed, but we'll see
once I get 3.10 into better shape.

> So we need to debug/bisect why, without the patch (what you call
> high_rate_steer=0), you don't get data on all groups.  Can you bisect
> that on a single node, e.g. keep the rest of the environment on 3.4,
> which works, and on a given node find the commit that breaks things?

Done. It appears that the patch that breaks receiving packets on many
different multicast groups/sockets is:

commit 4cd729b04285b7330edaf5a7080aa795d6d15ff3
Author: Vlad Yasevich <vyasevic@redhat.com>
Date:   Mon Apr 15 09:54:25 2013 +0000

    net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api
    
    The current implementation of dev_uc_sync/unsync() assumes that there is
    a strict 1-to-1 relationship between the source and destination of the sync.
    In other words, once an address has been synced to a destination device, it
    will not be synced to any other device through the sync API.
    However, there are some virtual devices that aggreate a number of lower
    devices and need to sync addresses to all of them.  The current
    API falls short there.
    
    This patch introduces a new dev_uc_sync_multiple() api that can be called
    in the above circumstances and allows sync to work for every invocation.
    
    CC: Jiri Pirko <jiri@resnulli.us>
    Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

I've confirmed that reverting this patch on top of 3.10-rc3 allows me
to receive packets on all of my multicast groups without the Mellanox
high_rate_steer option set.

--
Shawn

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-30 20:31           ` Shawn Bohrer
@ 2013-05-30 20:42             ` Or Gerlitz
  2013-05-30 20:57               ` Vlad Yasevich
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2013-05-30 20:42 UTC (permalink / raw)
  To: Shawn Bohrer, Vlad Yasevich
  Cc: netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

On Thu, May 30, 2013 at 11:31 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
>> So we need to debug/bisect why, without the patch (what you call
>> high_rate_steer=0), you don't get data on all groups.  Can you bisect
>> that on a single node, e.g. keep the rest of the environment on 3.4,
>> which works, and on a given node find the commit that breaks things?

> Done. It appears that the patch that breaks receiving packets on many
> different multicast groups/sockets is:
>
> commit 4cd729b04285b7330edaf5a7080aa795d6d15ff3
> Author: Vlad Yasevich <vyasevic@redhat.com>
> Date:   Mon Apr 15 09:54:25 2013 +0000
>
>     net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api
>
>     The current implementation of dev_uc_sync/unsync() assumes that there is
>     a strict 1-to-1 relationship between the source and destination of the sync.
>     In other words, once an address has been synced to a destination device, it
>     will not be synced to any other device through the sync API.
>     However, there are some virtual devices that aggreate a number of lower
>     devices and need to sync addresses to all of them.  The current
>     API falls short there.
>
>     This patch introduces a new dev_uc_sync_multiple() api that can be called
>     in the above circumstances and allows sync to work for every invocation.
>
>     CC: Jiri Pirko <jiri@resnulli.us>
>     Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> I've confirmed that reverting this patch on top of 3.10-rc3 allows me
> to receive packets on all of my multicast groups without the Mellanox
> high_rate_steer option set.

OK, impressive debugging... so what do we do from here?  Vlad, Shawn
observes a regression once this patch is used in a large-scale setup
with many multicast groups (you can read the earlier posts in this
thread); does this ring any bells w.r.t. the actual problem in the
patch?

Or.

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-30 20:42             ` Or Gerlitz
@ 2013-05-30 20:57               ` Vlad Yasevich
  2013-05-31  0:23                 ` Jay Vosburgh
  0 siblings, 1 reply; 16+ messages in thread
From: Vlad Yasevich @ 2013-05-30 20:57 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Shawn Bohrer, netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

On 05/30/2013 04:42 PM, Or Gerlitz wrote:
> On Thu, May 30, 2013 at 11:31 PM, Shawn Bohrer <shawn.bohrer@gmail.com> wrote:
>>> So we need to debug/bisect why, without the patch (what you call
>>> high_rate_steer=0), you don't get data on all groups.  Can you bisect
>>> that on a single node, e.g. keep the rest of the environment on 3.4,
>>> which works, and on a given node find the commit that breaks things?
>
>> Done. It appears that the patch that breaks receiving packets on many
>> different multicast groups/sockets is:
>>
>> commit 4cd729b04285b7330edaf5a7080aa795d6d15ff3
>> Author: Vlad Yasevich <vyasevic@redhat.com>
>> Date:   Mon Apr 15 09:54:25 2013 +0000
>>
>>      net: add dev_uc_sync_multiple() and dev_mc_sync_multiple() api
>>
>>      The current implementation of dev_uc_sync/unsync() assumes that there is
>>      a strict 1-to-1 relationship between the source and destination of the sync.
>>      In other words, once an address has been synced to a destination device, it
>>      will not be synced to any other device through the sync API.
>>      However, there are some virtual devices that aggreate a number of lower
>>      devices and need to sync addresses to all of them.  The current
>>      API falls short there.
>>
>>      This patch introduces a new dev_uc_sync_multiple() api that can be called
>>      in the above circumstances and allows sync to work for every invocation.
>>
>>      CC: Jiri Pirko <jiri@resnulli.us>
>>      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>> I've confirmed that reverting this patch on top of 3.10-rc3 allows me
>> to receive packets on all of my multicast groups without the Mellanox
>> high_rate_steer option set.
>
> OK, impressive debugging... so what do we do from here?  Vlad, Shawn
> observes a regression once this patch is used in a large-scale setup
> with many multicast groups (you can read the earlier posts in this
> thread); does this ring any bells w.r.t. the actual problem in the
> patch?

I haven't seen that, but I didn't test with that many multicast groups.
I had 20 groups working.

I'll take a look and see what might be going on.

Thanks
-vlad

>
> Or.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-30 20:57               ` Vlad Yasevich
@ 2013-05-31  0:23                 ` Jay Vosburgh
  2013-05-31 15:17                   ` Shawn Bohrer
  0 siblings, 1 reply; 16+ messages in thread
From: Jay Vosburgh @ 2013-05-31  0:23 UTC (permalink / raw)
  To: vyasevic
  Cc: Or Gerlitz, Shawn Bohrer, netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

Vlad Yasevich <vyasevic@redhat.com> wrote:

>>>      CC: Jiri Pirko <jiri@resnulli.us>
>>>      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>>
>>> I've confirmed that reverting this patch on top of 3.10-rc3 allows me
>>> to receive packets on all of my multicast groups without the Mellanox
>>> high_rate_steer option set.
>>
>> OK, impressive debugging... so what do we do from here?  Vlad, Shawn
>> observes a regression once this patch is used in a large-scale setup
>> with many multicast groups (you can read the earlier posts in this
>> thread); does this ring any bells w.r.t. the actual problem in the
>> patch?
>
>I haven't seen that, but I didn't test with that many multicast groups. I
>had 20 groups working.
>
>I'll take a look and see what might be going on.

	I've actually been porting bonding to the dev_sync/unsync
system, and have a patch series of 4 fixes to various internals of
dev_sync/unsync; I'll post those under separate cover.  It may be that
one or more of those things are the source of this problem (or I might
have it all wrong).

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

* Re: 3.10.0-rc2 mlx4 not receiving packets for some multicast groups
  2013-05-31  0:23                 ` Jay Vosburgh
@ 2013-05-31 15:17                   ` Shawn Bohrer
  0 siblings, 0 replies; 16+ messages in thread
From: Shawn Bohrer @ 2013-05-31 15:17 UTC (permalink / raw)
  To: Jay Vosburgh
  Cc: vyasevic, Or Gerlitz, netdev, Hadar Hen Zion, Amir Vadai, Jiri Pirko

On Thu, May 30, 2013 at 05:23:20PM -0700, Jay Vosburgh wrote:
> Vlad Yasevich <vyasevic@redhat.com> wrote:
> 
> >>>      CC: Jiri Pirko <jiri@resnulli.us>
> >>>      Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
> >>>      Signed-off-by: David S. Miller <davem@davemloft.net>
> >>>
> >>> I've confirmed that reverting this patch on top of 3.10-rc3 allows me
> >>> to receive packets on all of my multicast groups without the Mellanox
> >>> high_rate_steer option set.
> >>
> >> OK, impressive debugging... so what do we do from here?  Vlad, Shawn
> >> observes a regression once this patch is used in a large-scale setup
> >> with many multicast groups (you can read the earlier posts in this
> >> thread); does this ring any bells w.r.t. the actual problem in the
> >> patch?
> >
> >I haven't seen that, but I didn't test with that many multicast groups. I
> >had 20 groups working.
> >
> >I'll take a look and see what might be going on.
> 
> 	I've actually been porting bonding to the dev_sync/unsync
> system, and have a patch series of 4 fixes to various internals of
> dev_sync/unsync; I'll post those under separate cover.  It may be that
> one or more of those things are the source of this problem (or I might
> have it all wrong).

Thanks Jay,

I've tested your 4 patches on top of Linus' tree and they do solve the
multicast issue I was seeing in this thread.

--
Shawn

Thread overview: 16 messages
2013-05-24 15:49 3.10.0-rc2 mlx4 not receiving packets for some multicast groups Shawn Bohrer
2013-05-24 16:34 ` Shawn Bohrer
2013-05-24 16:58   ` Eric Dumazet
2013-05-25  3:41   ` Or Gerlitz
2013-05-25 15:13     ` Shawn Bohrer
2013-05-25 19:41       ` Or Gerlitz
2013-05-25 21:37         ` Shawn Bohrer
2013-05-28 20:15       ` Shawn Bohrer
2013-05-29 13:55         ` Or Gerlitz
2013-05-30 20:31           ` Shawn Bohrer
2013-05-30 20:42             ` Or Gerlitz
2013-05-30 20:57               ` Vlad Yasevich
2013-05-31  0:23                 ` Jay Vosburgh
2013-05-31 15:17                   ` Shawn Bohrer
2013-05-25  3:49 ` Or Gerlitz
2013-05-25 14:02   ` Shawn Bohrer
