* [B.A.T.M.A.N.] Mesh hickups with 2014.3
@ 2015-06-11 12:19 Bjoern Franke
  2015-06-11 12:43 ` Hans-Werner Hilse
  2015-06-11 13:20 ` Marek Lindner
  0 siblings, 2 replies; 10+ messages in thread
From: Bjoern Franke @ 2015-06-11 12:19 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: freifunk-ol-dev

Hi,

Over the last few days we upgraded nearly all (~500) routers of our
Freifunk mesh from 2013.4 to 2014.3.
Most things run fine again, but some routers have connectivity issues
as follows:

- gateway gets alfreddata from router
- router pingable via batctl
- ping via linklocal, ULA and public-IPv6 from gateway to router not
possible
- other routers in the same mesh can ping it via ULA etc (connected via
the same fastd-VPN)

Public-IPv6 is announced from a gateway via radvd into the mesh.

Unfortunately we are out of ideas where we can search for the reason.

Regards
bjo
 
-- 
xmpp bjo@schafweide.org 





* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-11 12:19 [B.A.T.M.A.N.] Mesh hickups with 2014.3 Bjoern Franke
@ 2015-06-11 12:43 ` Hans-Werner Hilse
  2015-06-12 15:20   ` Bjoern Franke
  2015-06-11 13:20 ` Marek Lindner
  1 sibling, 1 reply; 10+ messages in thread
From: Hans-Werner Hilse @ 2015-06-11 12:43 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi,

On 2015-06-11 14:19, Bjoern Franke wrote:

> - gateway gets alfreddata from router

Continuously? So that direction seems fine then.

> - router pingable via batctl
> - ping via linklocal, ULA and public-IPv6 from gateway to router not
> possible

What does "not possible" mean? There's no reply?

> - other routers in the same mesh can ping it via ULA etc (connected via
> the same fastd-VPN)

So it's just the gateway that can't reach the "router"?

> Public-IPv6 is announced from a gateway via radvd into the mesh.

So even in that direction, NDP is fine. Seems only ICMPv6 echo requests 
are somewhat different then?

> Unfortunately we are out of ideas where we can search for the reason.

Look what's actually on the wire on both ends? Tcpdump and batctl 
tcpdump?

-hwh



* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-11 12:19 [B.A.T.M.A.N.] Mesh hickups with 2014.3 Bjoern Franke
  2015-06-11 12:43 ` Hans-Werner Hilse
@ 2015-06-11 13:20 ` Marek Lindner
  2015-06-11 14:44   ` Bjoern Franke
  1 sibling, 1 reply; 10+ messages in thread
From: Marek Lindner @ 2015-06-11 13:20 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: freifunk-ol-dev


On Thursday, June 11, 2015 14:19:23 Bjoern Franke wrote:
> - ping via linklocal, ULA and public-IPv6 from gateway to router not
> possible
> - other routers in the same mesh can ping it via ULA etc (connected via
> the same fastd-VPN)
> 
> Public-IPv6 is announced from a gateway via radvd into the mesh.
> 
> Unfortunately we are out of ideas where we can search for the reason.

Do you have the multicast optimizations enabled ? 2014.3.0 still has a known 
bug causing these optimizations to harm multicast traffic. Either disable this 
feature or upgrade to something newer than 2014.3.0.
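
For reference, a minimal Python sketch of checking and switching off the
feature mentioned above. It assumes the mesh interface is named bat0 and
that this batman-adv build exposes the setting through sysfs as
/sys/class/net/<mesh_iface>/mesh/multicast_mode; the function names and the
sysfs_root parameter are hypothetical and exist only so the helpers can be
tried against a scratch directory instead of a live system.

```python
from pathlib import Path


def multicast_mode_path(mesh_iface="bat0", sysfs_root="/sys/class/net"):
    """Return the (assumed) sysfs file holding batman-adv's multicast_mode."""
    return Path(sysfs_root) / mesh_iface / "mesh" / "multicast_mode"


def multicast_optimizations_enabled(mesh_iface="bat0", sysfs_root="/sys/class/net"):
    """True if the file reports the multicast optimizations as switched on."""
    value = multicast_mode_path(mesh_iface, sysfs_root).read_text().strip()
    # Assumption: the attribute reads back as "enabled"/"disabled" (or 1/0).
    return value in ("1", "enabled")


def disable_multicast_optimizations(mesh_iface="bat0", sysfs_root="/sys/class/net"):
    """Write 0 so multicast falls back to classic flooding (needs root on a real node)."""
    multicast_mode_path(mesh_iface, sysfs_root).write_text("0\n")
```

On a real node the same setting should also be reachable through batctl;
the sketch only shows which knob is meant.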

Cheers,
Marek



* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-11 13:20 ` Marek Lindner
@ 2015-06-11 14:44   ` Bjoern Franke
  2015-06-12  2:53     ` Marek Lindner
  2015-06-12  4:32     ` Linus Lüssing
  0 siblings, 2 replies; 10+ messages in thread
From: Bjoern Franke @ 2015-06-11 14:44 UTC (permalink / raw)
  To: Marek Lindner, b.a.t.m.a.n; +Cc: freifunk-ol-dev


Hi Marek,


> Do you have the multicast optimizations enabled ? 2014.3.0 still has 
> a known 
> bug causing these optimizations to harm multicast traffic. Either 
> disable this 
> feature or upgrade to something newer than 2014.3.0.
> 

Thanks for your reply. Multicast optimizations are disabled, but we have
multicast related errors in the logs:

br-mesh: Multicast hash table chain limit reached: bat0
br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22

Regards
bjo
--
xmpp bjo@schafweide.org 





* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-11 14:44   ` Bjoern Franke
@ 2015-06-12  2:53     ` Marek Lindner
  2015-06-12  4:32     ` Linus Lüssing
  1 sibling, 0 replies; 10+ messages in thread
From: Marek Lindner @ 2015-06-12  2:53 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: freifunk-ol-dev


On Thursday, June 11, 2015 16:44:14 Bjoern Franke wrote:
> Thanks for your reply. Multicast optimizations are disabled, but we have
> multicast related errors in the logs:
> br-mesh: Multicast hash table chain limit reached: bat0
> br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22

That message is not coming from batman-adv but the bridge code itself. To me 
it looks like the bridge code disables multicast once the hash table limit is 
reached. Use the search engine of your choice to get better info. Sounds 
definitely related to your issue.

Cheers,
Marek



* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-11 14:44   ` Bjoern Franke
  2015-06-12  2:53     ` Marek Lindner
@ 2015-06-12  4:32     ` Linus Lüssing
       [not found]       ` <1434121488.3296.8.camel@nord-west.org>
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Lüssing @ 2015-06-12  4:32 UTC (permalink / raw)
  To: Bjoern Franke; +Cc: b.a.t.m.a.n, freifunk-ol-dev

On Thu, Jun 11, 2015 at 04:44:14PM +0200, Bjoern Franke wrote:
> Hi Marek,
> 
> 
> > Do you have the multicast optimizations enabled ? 2014.3.0 still has 
> > a known 
> > bug causing these optimizations to harm multicast traffic. Either 
> > disable this 
> > feature or upgrade to something newer than 2014.3.0.
> > 
> 
> Thanks for your reply. Multicast optimizations are disabled, but we have
> multicast related errors in the logs:
> br-mesh: Multicast hash table chain limit reached: bat0
> br-mesh: Cannot rehash multicast hash table, disabling snooping: bat0, 201, -22
> Regards
> bjo

There's a hash_max value for the bridge
(/sys/class/net/<br>/bridge/hash_max). By default it's rather
small, just 512 entries / multicast listeners, so it's expected that
with 500 nodes the bridge multicast snooping will shut down.
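
A minimal Python sketch of inspecting and raising that limit, in case it
helps. The bridge name br-mesh is taken from the log lines quoted above;
the function names and the sysfs_root parameter are hypothetical, added
only so the code can be tried against a scratch directory. The
power-of-two check is an assumption about what the kernel accepts, so
treat it as a precaution rather than documented behaviour.

```python
from pathlib import Path


def hash_max_path(bridge, sysfs_root="/sys/class/net"):
    """Path of the bridge's multicast hash_max attribute."""
    return Path(sysfs_root) / bridge / "bridge" / "hash_max"


def read_hash_max(bridge, sysfs_root="/sys/class/net"):
    """Current hash_max as an integer."""
    return int(hash_max_path(bridge, sysfs_root).read_text().strip())


def raise_hash_max(bridge, new_value, sysfs_root="/sys/class/net"):
    """Raise hash_max (never lower it) and return the value in effect afterwards."""
    # Assumption: the kernel only accepts powers of two for hash_max.
    if new_value < 1 or new_value & (new_value - 1):
        raise ValueError("hash_max should be a power of two")
    current = read_hash_max(bridge, sysfs_root)
    if new_value > current:
        # Writing the real sysfs file requires root.
        hash_max_path(bridge, sysfs_root).write_text(f"{new_value}\n")
        return new_value
    return current
```

For example, raise_hash_max("br-mesh", 4096) would lift the default of 512
to 4096 entries. A value set this way does not survive a reboot.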

Nevertheless, even if the bridge deactivates its multicast
snooping that shouldn't cause trouble for ICMPv6.

Which firmware are you using, do you see ICMPv6 packets entering
bat0 on the gateway and leaving bat0 on the router?

Cheers, Linus


* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-11 12:43 ` Hans-Werner Hilse
@ 2015-06-12 15:20   ` Bjoern Franke
  2015-06-12 20:39     ` Hans-Werner Hilse
  0 siblings, 1 reply; 10+ messages in thread
From: Bjoern Franke @ 2015-06-12 15:20 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi hwh,




> Continuously? So that direction seems fine then.

Yep.




> What does "not possible" mean? There's no reply?

No reply, yes.




> So it's just the gateway that can't reach the "router"?

Correct.

> > Public-IPv6 is announced from a gateway via radvd into the mesh.
> 
> So even in that direction, NDP is fine. Seems only ICMPv6 echo 
> requests 
> are somewhat different then?

ip -6 neigh show says "FAILED".

> > Unfortunately we are out of ideas where we can search for the 
> > reason.
> 
> Look what's actually on the wire on both ends? Tcpdump and batctl 
> tcpdump?
> 
> 

On the gateway, there are neighbor solicitations from the "router" and
advertisements from the gateway.

At the moment, it's not possible to access the "router" to take a look
at what's going wrong there.

Regards
bjo
-- 
xmpp bjo@schafweide.org 





* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
       [not found]       ` <1434121488.3296.8.camel@nord-west.org>
@ 2015-06-12 20:33         ` Linus Lüssing
  2015-06-14 10:20           ` Bjoern Franke
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Lüssing @ 2015-06-12 20:33 UTC (permalink / raw)
  To: Bjoern Franke; +Cc: b.a.t.m.a.n, freifunk-ol-dev

On Fri, Jun 12, 2015 at 05:04:48PM +0200, Bjoern Franke wrote:
> Hi Linus,
> 
> > There's a hash_max value for the bridge
> > (/sys/class/net/<br>/bridge/hash_max). By default it's rather
> > small, just 512 entries / multicast listeners, so it's expected that
> > with 500 nodes the bridge multicast snooping will shut down.
> 
> Thanks for your reply, I increased the value now to 4096 on the
> gateways.
> 
> > Nevertheless, even if the bridge deactivates its multicast
> > snooping that shouldn't cause trouble for ICMPv6.
> > 
> > Which firmware are you using, do you see ICMPv6 packets entering
> > bat0 on the gateway and leaving bat0 on the router?
> 
> We are running gluon 2014.4 on the routers. 

Okay. In Gluon both the batman-adv multicast optimizations and
bridge multicast snooping are deactivated by default. So I think
it's probably an issue with the bridge on your gateway. Does
deactivating the batman-adv and bridge multicast stuff make a
difference?

Which kernel version and Linux distribution are you using on the
gateway?

> 
> I see solicitations and advertisements on the gateway:
> 16:59:54.567582 IP6 fd74:fdaa:9dc4:0:fa1a:67ff:feeb:c77e >
> ff02::1:ff40:1: ICMP6, neighbor solicitation, who has
> fd74:fdaa:9dc4::40:1, length 32
> 16:59:54.567661 IP6 fd74:fdaa:9dc4::40:1 >
> fd74:fdaa:9dc4:0:fa1a:67ff:feeb:c77e: ICMP6, neighbor advertisement,
> tgt is fd74:fdaa:9dc4::40:1, length 32

Measured on bat0 or on the bridge on top?
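
To compare captures taken at different points (bat0 versus the bridge, or
gateway versus router), a small Python sketch that pairs solicitations
with advertisements in tcpdump text output could look like this. It
assumes each packet is on one line, as tcpdump prints it before mail
wrapping; the regular expressions and the function name are illustrative
only, and addresses are compared as plain strings without normalization.

```python
import re

NS_RE = re.compile(
    r"IP6 (\S+) > (\S+): ICMP6, neighbor solicitation, who has ([0-9a-fA-F:]+)")
NA_RE = re.compile(
    r"IP6 (\S+) > (\S+): ICMP6, neighbor advertisement, tgt is ([0-9a-fA-F:]+)")


def unanswered_solicitations(lines):
    """Return (requester, target) pairs whose solicitation saw no advertisement."""
    pending = []
    for line in lines:
        ns = NS_RE.search(line)
        if ns:
            pending.append((ns.group(1), ns.group(3)))
            continue
        na = NA_RE.search(line)
        if na:
            # An advertisement sent to the requester for the same target answers it.
            answered = (na.group(2), na.group(3))
            pending = [p for p in pending if p != answered]
    return pending


# The two packets from the capture quoted above, unwrapped to one line each.
capture = [
    "16:59:54.567582 IP6 fd74:fdaa:9dc4:0:fa1a:67ff:feeb:c77e > ff02::1:ff40:1: "
    "ICMP6, neighbor solicitation, who has fd74:fdaa:9dc4::40:1, length 32",
    "16:59:54.567661 IP6 fd74:fdaa:9dc4::40:1 > fd74:fdaa:9dc4:0:fa1a:67ff:feeb:c77e: "
    "ICMP6, neighbor advertisement, tgt is fd74:fdaa:9dc4::40:1, length 32",
]
print(unanswered_solicitations(capture))  # prints []
```

An empty result only says that the advertisement was seen at the capture
point. It says nothing about whether the advertisement reached the router,
which is why the same check on the router side would be the interesting one.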

> 
> Unfortunately it's not possible to access the router at the moment to
> have a look at what's leaving on bat0 there.
> 
> Regards
> Björn

Cheers, Linus

PS: Please keep the mailing lists in Cc.


* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-12 15:20   ` Bjoern Franke
@ 2015-06-12 20:39     ` Hans-Werner Hilse
  0 siblings, 0 replies; 10+ messages in thread
From: Hans-Werner Hilse @ 2015-06-12 20:39 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi,

On 2015-06-12 17:20, Bjoern Franke wrote:
>> > Public-IPv6 is announced from a gateway via radvd into the mesh.
>> 
>> So even in that direction, NDP is fine. Seems only ICMPv6 echo
>> requests
>> are somewhat different then?
> 
> ip -6 neigh show says "FAILED".

My bad: I meant to write "RA" instead of NDP. In fact, NDP seems to be 
failing, as you have determined.

>> Look what's actually on the wire on both ends? Tcpdump and batctl
>> tcpdump?
> 
> On the gateway, there are neighbor solicitations from the "router" and
> advertisements from the gateway.

The advertisements are fine then, good.

I've seen the behaviour you're describing with the Linux bridge code and 
multicast (MLD) snooping. Mind you: with *activated* snooping. NDP does 
not (does it?) register its multicast groups with listener reports (or 
does it once, too early?), and the bridge code then does not consider the 
relevant end that should receive NDP traffic a destination for the 
multicast traffic that should have gotten there.
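
As background: a neighbor solicitation is not sent to the target's unicast 
address but to its solicited-node multicast group (ff02::1:ffXX:XXXX, built 
from the low 24 bits of the address), which is why a bridge that prunes 
multicast too eagerly can break NDP while unicast forwarding still works. 
A small Python sketch of how that group is derived; the function name is 
just illustrative:

```python
import ipaddress


def solicited_node_multicast(addr):
    """Solicited-node multicast group for a unicast IPv6 address."""
    # Keep the low 24 bits of the address and append them to ff02::1:ff00:0/104.
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)


print(solicited_node_multicast("fd74:fdaa:9dc4::40:1"))  # prints ff02::1:ff40:1
```

That is the destination seen in the solicitation for fd74:fdaa:9dc4::40:1 
in the capture posted elsewhere in this thread.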

Of course, the question that arises is whether the solicitation is 
actually received, then handled (answered) by the gateway.

Did any network bridges come into play on the gateway(s) that hadn't 
been there before?

-hwh


* Re: [B.A.T.M.A.N.] Mesh hickups with 2014.3
  2015-06-12 20:33         ` Linus Lüssing
@ 2015-06-14 10:20           ` Bjoern Franke
  0 siblings, 0 replies; 10+ messages in thread
From: Bjoern Franke @ 2015-06-14 10:20 UTC (permalink / raw)
  To: Linus Lüssing; +Cc: b.a.t.m.a.n, freifunk-ol-dev

Hi,

> 
> Okay. In Gluon both the batman-adv multicast optimizations and
> bridge multicast snooping are deactivated by default. So I think
> it's probably an issue with the bridge on your gateway. Does
> deactivating the batman-adv and bridge multicast stuff make a
> difference?

Unfortunately not.

> Which kernel version and Linux distribution are you using on the
> gateway?
> > 

3.16.0-4-amd64 / Debian Jessie. In the old setup (with 2013.4) it was
3.2.0-4-amd64 / wheezy

> 
> 
> Measured on bat0 or on the bridge on top?

On the bridge.

> 
> PS: Please keep the mailing lists in Cc.

Sorry, I thought it was sent off the list.

Regards
Bjoern

-- 
xmpp bjo@schafweide.org 



