* Batman_V Originator Loop Issue
@ 2020-07-07 19:47 lavincent15
  2020-07-08  9:15 ` Simon Wunderlich
  0 siblings, 1 reply; 10+ messages in thread
From: lavincent15 @ 2020-07-07 19:47 UTC (permalink / raw)
  To: b.a.t.m.a.n

Running batctl 2020.1-openwrt-1 [batman-adv: 2020.1-openwrt-2]

When running a two-node network with one node connected to my LAN and the other operating as an access point, my network works great. I can connect clients to my batman nodes and access my LAN.

When I boot up a third node, my network works for 1 minute, then breaks down. My LAN cannot ping any of the batman nodes anymore.

I keep receiving messages like this: "[ 2900.755655] br-lan: received packet on bat0 with own address as source address (addr:8c:ae:4c:db:14:5c, vlan:0)", which I think indicates a bridge loop.

My originator table looks wrong, as I can see my own host's originator entries along with those of all the neighbor nodes:

root@OpenWrt:/etc/config# batctl o -n
[B.A.T.M.A.N. adv 2020.1-openwrt-2, MainIF/MAC: mesh0/00:30:1a:4e:b8:26 (bat0/f2:07:f1:5f:e0:78 BATMAN_V)]
   Originator        last-seen ( throughput)  Nexthop           [outgoingIF]
 * 00:30:1a:4e:b8:18    0.570s (       86.7) 00:30:1a:4e:b8:2e [     mesh0]
   00:30:1a:4e:b8:18    0.570s (       21.6) 00:30:1a:4e:b8:18 [     mesh0]
 * 00:30:1a:4e:b8:2e    1.510s (      212.6) 00:30:1a:4e:b8:2e [     mesh0]
   00:30:1a:4e:b8:2e    1.510s (       38.9) 00:30:1a:4e:b8:18 [     mesh0]
   00:30:1a:4e:b8:26    1.510s (       38.9) 00:30:1a:4e:b8:18 [     mesh0]
 * 00:30:1a:4e:b8:26    1.510s (      108.9) 00:30:1a:4e:b8:2e [     mesh0]

root@OpenWrt:/etc/config# batctl n -n
[B.A.T.M.A.N. adv 2020.1-openwrt-2, MainIF/MAC: mesh0/00:30:1a:4e:b8:26 (bat0/f2:07:f1:5f:e0:78 BATMAN_V)]
IF             Neighbor              last-seen
00:30:1a:4e:b8:2e    0.490s (      179.0) [     mesh0]
00:30:1a:4e:b8:18    0.380s (       79.2) [     mesh0]
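
The symptom described above (the node's own MAC, 00:30:1a:4e:b8:26, showing up as an originator in its own table) can be checked mechanically. The following is only a minimal sketch: it assumes the `batctl o -n` text layout pasted above, assumes that a node would not normally list its own primary MAC as an originator, and the function name `own_originator_rows` is made up for illustration.

```python
import re

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
# The header line carries the node's own primary MAC after "MainIF/MAC: <iface>/".
HEADER_RE = re.compile(rf"MainIF/MAC:\s*[^/\s]+/({MAC})", re.I)
# One table row: optional "*" (best route), originator, last-seen,
# throughput in Mbit/s, next hop, outgoing interface.
ROW_RE = re.compile(
    rf"^\s*(\*?)\s*({MAC})\s+[\d.]+s\s+\(\s*([\d.]+)\)\s+({MAC})\s+\[\s*([^\]\s]+)\]",
    re.I,
)


def own_originator_rows(table_text):
    """Return the table rows whose originator is the node's own primary MAC."""
    header = HEADER_RE.search(table_text)
    if header is None:
        raise ValueError("no MainIF/MAC header found")
    own_mac = header.group(1).lower()
    rows = []
    for line in table_text.splitlines():
        match = ROW_RE.match(line)
        if match and match.group(2).lower() == own_mac:
            rows.append({
                "best": match.group(1) == "*",
                "throughput": float(match.group(3)),
                "nexthop": match.group(4).lower(),
                "iface": match.group(5),
            })
    return rows


# The originator table exactly as posted in this message.
SAMPLE = """\
[B.A.T.M.A.N. adv 2020.1-openwrt-2, MainIF/MAC: mesh0/00:30:1a:4e:b8:26 (bat0/f2:07:f1:5f:e0:78 BATMAN_V)]
   Originator        last-seen ( throughput)  Nexthop           [outgoingIF]
 * 00:30:1a:4e:b8:18    0.570s (       86.7) 00:30:1a:4e:b8:2e [     mesh0]
   00:30:1a:4e:b8:18    0.570s (       21.6) 00:30:1a:4e:b8:18 [     mesh0]
 * 00:30:1a:4e:b8:2e    1.510s (      212.6) 00:30:1a:4e:b8:2e [     mesh0]
   00:30:1a:4e:b8:2e    1.510s (       38.9) 00:30:1a:4e:b8:18 [     mesh0]
   00:30:1a:4e:b8:26    1.510s (       38.9) 00:30:1a:4e:b8:18 [     mesh0]
 * 00:30:1a:4e:b8:26    1.510s (      108.9) 00:30:1a:4e:b8:2e [     mesh0]
"""

if __name__ == "__main__":
    for row in own_originator_rows(SAMPLE):
        print(row)
```

On the table above this reports two rows for the node's own address, one via each neighbor, which matches the "looks wrong" observation.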



Here is my /etc/config/network:
config interface 'loopback'
        option ifname 'lo'
        option proto 'static'
        option ipaddr '127.0.0.1'
        option netmask '255.0.0.0'

config globals 'globals'
        option ula_prefix 'fdc4:e092:8929::/48'

config interface 'lan'
        option type 'bridge'
        option proto 'static'
        option ipaddr '192.168.0.32'
        option netmask '255.255.255.0'
        option ip6assign '60'
        option gateway '192.168.0.1'
        list dns '8.8.8.8'
        option ifname 'bat0 eth0'

config interface 'nwi_mesh0'
        option mtu '2304'
        option proto 'batadv_hardif'
        option master 'bat0'

config interface 'bat0'
        option proto 'batadv'
        option routing_algo 'BATMAN_V'
        option aggregated_ogms '1'
        option ap_isolation '0'
        option bonding '0'
        option fragmentation '1'
        option gw_mode 'server'
        option log_level '0'
        option orig_interval '1000'
        option bridge_loop_avoidance '1'
        option distributed_arp_table '1'
        option multicast_mode '1'
        option network_coding '0'
        option hop_penalty '30'
        option isolation_mark '0x00000000/0x00000000'


And here is my /etc/config/wireless
root@OpenWrt:/etc/config# cat wireless
config wifi-device 'radio0'
        option type 'mac80211'
        option channel '36'
        option hwmode '11a'
        option path 'soc0/soc/1ffc000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0'
        option htmode 'VHT80'

config wifi-iface 'mesh0'
        option device 'radio0'
        option ifname 'mesh0'
        option network 'nwi_mesh0'
        option mode 'mesh'
        option mesh_fwding '0'
        option mesh_id 'batman_mesh'
        option encryption 'none'

config wifi-iface 'wifinet0'
        option device 'radio0'
        option mode 'ap'
        option ssid 'N2-Lander'
        option encryption 'psk2'
        option key 'finnjamin'
        option ifname 'wlan0'
        option network 'lan'


Any and all help is greatly appreciated

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-07 19:47 Batman_V Originator Loop Issue lavincent15
@ 2020-07-08  9:15 ` Simon Wunderlich
  2020-07-08 14:44   ` lavincent15
  2020-07-08 15:26   ` lavincent15
  0 siblings, 2 replies; 10+ messages in thread
From: Simon Wunderlich @ 2020-07-08  9:15 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: lavincent15


Hi Luke,

can you please describe which nodes are connected to the LAN and which are 
not? You say "one is connected to LAN" and the others are "operating as an 
access point", does that mean they are not connected to the same LAN via 
Ethernet?

If multiple nodes are connected and bridged to the same LAN, bridge loop 
avoidance should be enabled - you have that in your config, but you could 
double check with "batctl bl" and then "batctl bbt"/"batctl cl" (please post 
these tables if you think this could be connected).

You could also try disabling distributed arp table and multicast mode, just to 
make sure this is not shooting us in the foot here. Those optimizations are 
not really needed for such a small network.

Cheers,
       Simon

On Tuesday, July 7, 2020 9:47:31 PM CEST lavincent15@gmail.com wrote:
> Running batctl 2020.1-openwrt-1 [batman-adv: 2020.1-openwrt-2]
> 
> When running a two-node network with one node connected to my LAN and the
> other operating as an access point, my network works great. I can connect
> clients to my batman nodes and access my LAN.
> 
> When I boot up a third node, my network works for 1 minute, then breaks
> down. My LAN cannot ping any of the batman nodes anymore.
> 
> I keep receiving messages like this: "[ 2900.755655] br-lan: received packet
> on bat0 with own address as source address (addr:8c:ae:4c:db:14:5c,
> vlan:0)", which I think indicates a bridge loop.
> 
> My originator table looks wrong, as I can see my own host's originator
> entries along with those of all the neighbor nodes:
> 
> root@OpenWrt:/etc/config# batctl o -n
> [B.A.T.M.A.N. adv 2020.1-openwrt-2, MainIF/MAC: mesh0/00:30:1a:4e:b8:26 (bat0/f2:07:f1:5f:e0:78 BATMAN_V)]
>    Originator        last-seen ( throughput)  Nexthop           [outgoingIF]
>  * 00:30:1a:4e:b8:18    0.570s (       86.7) 00:30:1a:4e:b8:2e [     mesh0]
>    00:30:1a:4e:b8:18    0.570s (       21.6) 00:30:1a:4e:b8:18 [     mesh0]
>  * 00:30:1a:4e:b8:2e    1.510s (      212.6) 00:30:1a:4e:b8:2e [     mesh0]
>    00:30:1a:4e:b8:2e    1.510s (       38.9) 00:30:1a:4e:b8:18 [     mesh0]
>    00:30:1a:4e:b8:26    1.510s (       38.9) 00:30:1a:4e:b8:18 [     mesh0]
>  * 00:30:1a:4e:b8:26    1.510s (      108.9) 00:30:1a:4e:b8:2e [     mesh0]
> 
> root@OpenWrt:/etc/config# batctl n -n
> [B.A.T.M.A.N. adv 2020.1-openwrt-2, MainIF/MAC: mesh0/00:30:1a:4e:b8:26 (bat0/f2:07:f1:5f:e0:78 BATMAN_V)]
> IF             Neighbor              last-seen
> 00:30:1a:4e:b8:2e    0.490s (      179.0) [     mesh0]
> 00:30:1a:4e:b8:18    0.380s (       79.2) [     mesh0]
> 
> 
> 
> Here is my /etc/config/network:
> config interface 'loopback'
>         option ifname 'lo'
>         option proto 'static'
>         option ipaddr '127.0.0.1'
>         option netmask '255.0.0.0'
> 
> config globals 'globals'
>         option ula_prefix 'fdc4:e092:8929::/48'
> 
> config interface 'lan'
>         option type 'bridge'
>         option proto 'static'
>         option ipaddr '192.168.0.32'
>         option netmask '255.255.255.0'
>         option ip6assign '60'
>         option gateway '192.168.0.1'
>         list dns '8.8.8.8'
>         option ifname 'bat0 eth0'
> 
> config interface 'nwi_mesh0'
>         option mtu '2304'
>         option proto 'batadv_hardif'
>         option master 'bat0'
> 
> config interface 'bat0'
>         option proto 'batadv'
>         option routing_algo 'BATMAN_V'
>         option aggregated_ogms '1'
>         option ap_isolation '0'
>         option bonding '0'
>         option fragmentation '1'
>         option gw_mode 'server'
>         option log_level '0'
>         option orig_interval '1000'
>         option bridge_loop_avoidance '1'
>         option distributed_arp_table '1'
>         option multicast_mode '1'
>         option network_coding '0'
>         option hop_penalty '30'
>         option isolation_mark '0x00000000/0x00000000'
> 
> 
> And here is my /etc/config/wireless
> root@OpenWrt:/etc/config# cat wireless
> config wifi-device 'radio0'
>         option type 'mac80211'
>         option channel '36'
>         option hwmode '11a'
>         option path 'soc0/soc/1ffc000.pcie/pci0000:00/0000:00:00.0/0000:01:00.0'
>         option htmode 'VHT80'
> 
> config wifi-iface 'mesh0'
>         option device 'radio0'
>         option ifname 'mesh0'
>         option network 'nwi_mesh0'
>         option mode 'mesh'
>         option mesh_fwding '0'
>         option mesh_id 'batman_mesh'
>         option encryption 'none'
> 
> config wifi-iface 'wifinet0'
>         option device 'radio0'
>         option mode 'ap'
>         option ssid 'N2-Lander'
>         option encryption 'psk2'
>         option key 'finnjamin'
>         option ifname 'wlan0'
>         option network 'lan'
> 
> 
> Any and all help is greatly appreciated



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-08  9:15 ` Simon Wunderlich
@ 2020-07-08 14:44   ` lavincent15
  2020-07-08 15:26   ` lavincent15
  1 sibling, 0 replies; 10+ messages in thread
From: lavincent15 @ 2020-07-08 14:44 UTC (permalink / raw)
  To: b.a.t.m.a.n

> can you please describe which nodes are connected to the LAN and which are
> not? You say "one is connected to LAN" and the others are "operating as an
> [...]

00:30:1a:4e:b8:26 is the only node that is connected via eth0 to my LAN. All three nodes are running mesh point and AP on mesh0 and wlan0 respectively.

> If multiple nodes are connected and bridged to the same LAN, bridge loop
> avoidance should be enabled - you have that in your config, but you could
> double check with "batctl bl" and then "batctl bbt"/"batctl cl" (please post
> these tables if you think this could be connected).
> 
> You could also try disabling distributed arp table and multicast mode, just to
> make sure this is not shooting us in the foot here. Those optimizations are
> not really needed for such a small network.

I see. Since only one node is connected to the LAN via eth0, I do not need BLA. I just tried disabling multicast mode, BLA, and the distributed ARP table. So far so good, it's working! Thank you so much for the help!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-08  9:15 ` Simon Wunderlich
  2020-07-08 14:44   ` lavincent15
@ 2020-07-08 15:26   ` lavincent15
  2020-07-09 20:33     ` Linus Lüssing
  1 sibling, 1 reply; 10+ messages in thread
From: lavincent15 @ 2020-07-08 15:26 UTC (permalink / raw)
  To: b.a.t.m.a.n

Simon,

When I enable DAT on all of my nodes, the network breaks down. With DAT disabled on all the nodes, the network works fine.

As I develop my project, I would like to take advantage of DAT, the mesh-wide ARP caching feature. Is there any way I can fix things so that DAT will work on my network?

Thanks,
Luke

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-08 15:26   ` lavincent15
@ 2020-07-09 20:33     ` Linus Lüssing
  2020-07-24 10:02       ` Linus Lüssing
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Lüssing @ 2020-07-09 20:33 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi Luke,

On Wed, Jul 08, 2020 at 03:26:49PM -0000, lavincent15@gmail.com wrote:
> Simon,
> 
> When I enable DAT on all of my nodes, the network breaks down. With DAT disabled on all the nodes, the network works fine.
> 
> As I develop my project, I would like to take advantage of DAT, the mesh-wide ARP caching feature. Is there any way I can fix things so that DAT will work on my network?
> 
> Thanks,
> Luke

Would it be possible for you to try an older version of
batman-adv, like v2019.0? There were a few new feature additions
for DAT after that one.

Btw. did you also try disabling aggregation with your batman-adv
2020.1 version? That didn't make a difference, right?

(disabling aggregation for BATMAN_V in v2019.0 won't make a difference
as it wasn't implemented there yet, so it would be great if you could
try that with 2020.1)

Regards, Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-09 20:33     ` Linus Lüssing
@ 2020-07-24 10:02       ` Linus Lüssing
  2020-07-24 15:00         ` lavincent15
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Lüssing @ 2020-07-24 10:02 UTC (permalink / raw)
  To: lavincent15; +Cc: b.a.t.m.a.n

On Thu, Jul 09, 2020 at 10:33:44PM +0200, Linus Lüssing wrote:
> Hi Luke,
> 
> On Wed, Jul 08, 2020 at 03:26:49PM -0000, lavincent15@gmail.com wrote:
> > Simon,
> > 
> > When I enable DAT on all of my nodes, the network breaks down. With DAT disabled on all the nodes, the network works fine.
> > 
> > As I develop my project, I would like to take advantage of DAT, the mesh-wide ARP caching feature. Is there any way I can fix things so that DAT will work on my network?
> > 
> > Thanks,
> > Luke
> 
> Would it be possible for you to try an older version of
> batman-adv, like v2019.0? There were a few new feature additions
> for DAT after that one.
> 
> Btw. did you also try disabling aggregation with your batman-adv
> 2020.1 version? That didn't make a difference, right?
> 
> (disabling aggregation for BATMAN_V in v2019.0 won't make a difference
> as it wasn't implemented there yet, so it would be great if you could
> try that with 2020.1)
> 
> Regards, Linus

Hi Luke,

Any news, especially regarding aggregation?

A likely bug in the aggregation code was found; a description and
a potential patch can be found here:

https://www.open-mesh.org/issues/413

Would be great if you could check if this is related to your issue
or not.

Regards, Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-24 10:02       ` Linus Lüssing
@ 2020-07-24 15:00         ` lavincent15
  2020-07-24 20:23           ` Linus Lüssing
  0 siblings, 1 reply; 10+ messages in thread
From: lavincent15 @ 2020-07-24 15:00 UTC (permalink / raw)
  To: b.a.t.m.a.n

Linus,

I have a working network with aggregated_ogms enabled and DAT disabled. I just tried disabling aggregated_ogms and the network continued to function properly. I then enabled DAT and the network continued to function properly. So it seems I just cannot have aggregated_ogms and DAT enabled at the same time.

Thanks,
Luke

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-24 15:00         ` lavincent15
@ 2020-07-24 20:23           ` Linus Lüssing
  2020-07-27 16:28             ` lavincent15
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Lüssing @ 2020-07-24 20:23 UTC (permalink / raw)
  To: lavincent15, b.a.t.m.a.n

On Fri, Jul 24, 2020 at 03:00:33PM -0000, lavincent15@gmail.com wrote:
> Linus,
> 
> I have a working network with aggregated_ogms enabled and DAT disabled. I just tried disabling aggregated_ogms and the network continued to function properly. I then enabled DAT and the network continued to function properly. So it seems I just cannot have aggregated_ogms and DAT enabled at the same time.
> 
> Thanks,
> Luke


Hi Luke,

Awesome, great news, that intensifies the suspicion that the issue
in the aggregation code is the main cause.

Would it be possible for you to try the patch from the ticket and
check whether it allows you to enable both DAT and aggregation?

https://www.open-mesh.org/issues/413
=> https://git.open-mesh.org/batman-adv.git/commit/0115502eab54a80f2c05884efce6ee164ed3cd9f

Cheers, Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-24 20:23           ` Linus Lüssing
@ 2020-07-27 16:28             ` lavincent15
  2020-07-27 18:36               ` Linus Lüssing
  0 siblings, 1 reply; 10+ messages in thread
From: lavincent15 @ 2020-07-27 16:28 UTC (permalink / raw)
  To: b.a.t.m.a.n

Linus

I would love to try it out and help with the development, but unfortunately I do not have the time to do that. My internship is coming to a close, and I need to use a version I know works to provide good data.

Side note: do you think you could provide me with a rough equation for when a node decides to route via an intermediate hop instead of a direct connection? I'm particularly interested in how the nodes use the hop penalty in the equation. Does shortening the interval speed up that decision? I'm using this in a mobile node environment and I need it to dynamically switch to the most stable connection.

Thanks,
Luke

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Batman_V Originator Loop Issue
  2020-07-27 16:28             ` lavincent15
@ 2020-07-27 18:36               ` Linus Lüssing
  0 siblings, 0 replies; 10+ messages in thread
From: Linus Lüssing @ 2020-07-27 18:36 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Mon, Jul 27, 2020 at 04:28:15PM -0000, lavincent15@gmail.com wrote:
> Linus
> 
> I would love to try it out and help with the development, but unfortunately I do not have the time to do that. My internship is coming to a close, and I need to use a version I know works to provide good data.

Oh, okay, good luck with the results then!

> Side note: do you think you could provide me with a rough equation for when a node decides to route via an intermediate hop instead of a direct connection? I'm particularly interested in how the nodes use the hop penalty in the equation. Does shortening the interval speed up that decision? I'm using this in a mobile node environment and I need it to dynamically switch to the most stable connection.

The metric for BATMAN V, including the hop penalty, is described here:

https://www.open-mesh.org/projects/batman-adv/wiki/Ogmv2#322-Metric-Update

Or here in the code, in this short function:

https://elixir.bootlin.com/linux/v5.7.8/source/net/batman-adv/bat_v_ogm.c#L470

The algorithm then compares whether the resulting throughput
metric is higher via the direct connection or via another hop,
after either the hop penalty or the half-duplex penalty has been
applied to the forwarded path.
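
As a rough illustration of the penalty step, here is a minimal Python sketch. It is not the kernel code: it is a paraphrase of the function linked above under stated assumptions, namely that throughput is counted in units of 100 kbit/s, that the hop penalty is scaled so that 255 equals 100 %, and that the half-duplex penalty halves the value only above 1 Mbit/s. The names `forward_penalty` and `prefers_relay` are made up for illustration, and all of these details should be checked against the linked source.

```python
TQ_MAX_VALUE = 255  # assumption: a hop penalty of 255 corresponds to 100 %


def forward_penalty(throughput, hop_penalty, same_wifi_interface):
    """Throughput a node advertises when it forwards an OGM.

    throughput is an integer in units of 100 kbit/s (assumption), so the
    86.7 Mbit/s shown by batctl would be 867 here.
    """
    # Assumed half-duplex rule: re-sending on the same, non-full-duplex WiFi
    # interface halves the throughput, except for very low values.
    if same_wifi_interface and throughput > 10:
        return throughput // 2
    # Otherwise the configured hop penalty is applied as a fraction of 255.
    return throughput * (TQ_MAX_VALUE - hop_penalty) // TQ_MAX_VALUE


def prefers_relay(direct_throughput, relayed_throughput):
    """Simplified comparison: take the relay only if its metric is higher."""
    return relayed_throughput > direct_throughput


if __name__ == "__main__":
    hop_penalty = 30  # value from the bat0 section of /etc/config/network
    print(forward_penalty(1000, hop_penalty, same_wifi_interface=False))
    print(forward_penalty(1000, hop_penalty, same_wifi_interface=True))
    # Hypothetical numbers: weak direct link vs. a relayed 100 Mbit/s path.
    print(prefers_relay(216, forward_penalty(1000, hop_penalty, True)))
```

With a hop penalty of 30, a forwarded 100.0 Mbit/s path drops to 88.2 Mbit/s under the hop penalty, or to 50.0 Mbit/s when the half-duplex rule applies instead, so a relay is chosen only when the direct link measures lower than the penalised value.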

Hope that helps.

Regards, Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread
