b.a.t.m.a.n.lists.open-mesh.org archive mirror
* [B.A.T.M.A.N.] Looping unicast packets when using BLA
@ 2016-02-08 11:35 Andreas Pape
  2016-02-08 12:29 ` Simon Wunderlich
       [not found] ` <4917381.eeOl7B1qNb-2016-02-09-07-20-04@prime>
  0 siblings, 2 replies; 9+ messages in thread
From: Andreas Pape @ 2016-02-08 11:35 UTC (permalink / raw)
  To: b.a.t.m.a.n

Hello

I have a problem in my mesh setup which is quite similar to Bug#216 of
the bug tracker.
I'm using batman-adv 2014.4.0 in a BLA setup consisting of 3 mesh nodes
(A, B, C) connected to the same backbone network via a common switch, and a
mesh node D connected to an end device E. I ping that single mesh node D
and the connected end device E from a PC which is connected to the same
switch as the three nodes A to C. BLA is compiled in and enabled.

From time to time I see looping unicast packets in my backbone network.
This unicast looping starts directly after one of the nodes A to C has
claimed the MAC address of my PC. The looping frame is then the ping
request sent by the PC. I have a Wireshark recording that shows this
behaviour; it was made in my backbone by mirroring one of the switch ports
to which a mesh node is connected.

I am not sure if I understood bla correctly but isn't it nonsense that a
bla backbone gateway claims MAC addresses from its own backbone (i.e. the
one it is directly connected to via its ethernet port)?

A simple change in batadv_bla_rx seems to solve this problem: add an
additional check before claiming a new mac address: if this address is
already known from the tt local table (via command batadv_is_my_client)
don't claim it.

This seems to solve my problem as far as I have tested so far. Any
thoughts about that?

Best regards,
Andreas



..................................................................
PHOENIX CONTACT ELECTRONICS GmbH

Sitz der Gesellschaft / registered office of the company: 31812 Bad Pyrmont
USt-Id-Nr.: DE811742156
Amtsgericht Hannover HRB 100528 / district court Hannover HRB 100528
Geschäftsführer / Executive Board: Roland Bent, Dr. Martin Heubeck
___________________________________________________________________
Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Mail. Das unerlaubte Kopieren, jegliche anderweitige Verwendung sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet.
----------------------------------------------------------------------------------------------------
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure, distribution or other use of the material or parts thereof is strictly forbidden.
___________________________________________________________________


* Re: [B.A.T.M.A.N.] Looping unicast packets when using BLA
  2016-02-08 11:35 [B.A.T.M.A.N.] Looping unicast packets when using BLA Andreas Pape
@ 2016-02-08 12:29 ` Simon Wunderlich
       [not found] ` <4917381.eeOl7B1qNb-2016-02-09-07-20-04@prime>
  1 sibling, 0 replies; 9+ messages in thread
From: Simon Wunderlich @ 2016-02-08 12:29 UTC (permalink / raw)
  To: b.a.t.m.a.n


Hi Andreas,

On Monday 08 February 2016 12:35:35 Andreas Pape wrote:
> Hello
> 
> I have a problem in my mesh setup which is quite similar to Bug#216 of
> the bug tracker.
> I'm using batman-adv 2014.4.0 in a BLA setup consisting of 3 mesh nodes
> (A, B, C) connected to the same backbone network via a common switch, and a
> mesh node D connected to an end device E. I ping that single mesh node D
> and the connected end device E from a PC which is connected to the same
> switch as the three nodes A to C. BLA is compiled in and enabled.

First of all, did you test v2016.0? v2014.4.0 is pretty old, the bug was 
created and closed in 2015 after all ...

> 
> From time to time I see looping unicast packets in my backbone network.
> This unicast looping starts directly after one of the nodes A to C has
> claimed the MAC address of my PC. The looping frame is then the ping
> request sent by the PC. I have a Wireshark recording that shows this
> behaviour; it was made in my backbone by mirroring one of the switch ports
> to which a mesh node is connected.

Is it really the ping packet looping? If yes, which nodes are part of the 
loop? Normally we only see broadcast packets looping. In #216 it was also 
broadcast packets where we have seen duplicates, and this was more a locking 
problem leading to creation of the same packets again and again.

> 
> I am not sure if I understood bla correctly but isn't it nonsense that a
> bla backbone gateway claims MAC addresses from its own backbone (i.e. the
> one it is directly connected to via its ethernet port)?

Yes, that appears to be nonsense indeed. Do you happen to have DAT enabled? 
There were also some problems with that which are fixed by now.

> 
> A simple change in batadv_bla_rx seems to solve this problem: add an
> additional check before claiming a new mac address: if this address is
> already known from the tt local table (via command batadv_is_my_client)
> don't claim it.
> 
> This seems to solve my problem as far as I have tested so far. Any
> thoughts about that?

This will prevent roaming from one of your nodes connected to the backbone
(A-C) to the mesh-only node D.

I would suggest upgrading and testing again, and disabling DAT if the
problem is still present (you should still report it if DAT makes a
difference in that case). If you still see a problem then, we probably have
something unsolved, and then I'd like to understand which nodes are part of
the loop.

Thank you!
     Simon



* [B.A.T.M.A.N.] Antwort: Re: Looping unicast packets when using BLA
       [not found] ` <4917381.eeOl7B1qNb-2016-02-09-07-20-04@prime>
@ 2016-02-09  7:01   ` Andreas Pape
  2016-02-11  9:19     ` Andreas Pape
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Pape @ 2016-02-09  7:01 UTC (permalink / raw)
  To: Simon Wunderlich; +Cc: b.a.t.m.a.n

Hi Simon,

thanks for the quick reply.

Simon Wunderlich <sw@simonwunderlich.de> wrote on 08.02.2016 13:29:55:

> From: Simon Wunderlich <sw@simonwunderlich.de>
> To: b.a.t.m.a.n@lists.open-mesh.org
> Cc: Andreas Pape <APape@phoenixcontact.com>
> Date: 09.02.2016 07:20
> Subject: Re: [B.A.T.M.A.N.] Looping unicast packets when using BLA
>
> Hi Andreas,
>
> On Monday 08 February 2016 12:35:35 Andreas Pape wrote:
> > Hello
> >
> > I have a problem in my mesh setup which is quite similar to Bug#216 of
> > the bug tracker.
> > I'm using batman-adv 2014.4.0 in a BLA setup consisting of 3 mesh nodes
> > (A, B, C) connected to the same backbone network via a common switch, and a
> > mesh node D connected to an end device E. I ping that single mesh node D
> > and the connected end device E from a PC which is connected to the same
> > switch as the three nodes A to C. BLA is compiled in and enabled.
>
> First of all, did you test v2016.0? v2014.4.0 is pretty old, the bug was
> created and closed in 2015 after all ...

I have just resumed last year's work on testing batman-adv and was a bit
too lazy to update to the latest version, as my devices use a fairly old
kernel version, 2.6.32. The update to 2014.4.0 early last year only
worked with Marek's help (an issue in the compat code).

But before making further assumptions, I'll start with the update.
In the meantime I am pretty sure that the problem does not come from the
BLA code as such. I changed the relevant part of batadv_bla_rx as follows:

        ether_addr_copy(search_claim.addr, ethhdr->h_source);
        search_claim.vid = vid;
        claim = batadv_claim_hash_find(bat_priv, &search_claim);

        if (!claim) {
                /* possible optimization: race for a claim */
                /* No claim exists yet, claim it for us!
                 */

                if (!batadv_is_my_client(bat_priv, ethhdr->h_source, vid)) {
                        batadv_handle_claim(bat_priv, primary_if,
                                        primary_if->net_dev->dev_addr,
                                        ethhdr->h_source, vid);
                        goto allow;
                } else {
                        printk("not claimed: %pM \n", ethhdr->h_source);
                        goto handled;
                }
        }

I did this yesterday in a "quick-and-dirty" way and restarted my ping test,
which ran until this morning without looping packets. But I did not notice
until now that I not only prevented the claiming of MAC addresses from
the gateway's own backbone, but also dropped the packets that triggered
the claim! That tells me that the original code in batadv_bla_rx is most
likely OK and that my problem comes from somewhere else (e.g. a ping request
from the PC to device E enters gateway A and is forwarded to gateway B via
the mesh, but gateway B does not forward it to mesh node D and instead sends
the packet via the Linux bridge and my eth0 interface to the backbone
network).

But before digging deeper into this, I'll give 2016.0 a try and see
if the problem is solved there.
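
The side effect described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not batman-adv code:
the names bla_verdict, bla_result, rx_unclaimed_drop and
rx_unclaimed_skip_claim are hypothetical, and the two verdicts merely stand
for the "allow" and "handled" labels of the snippet. It contrasts the
quick-and-dirty variant, which also consumes the frame, with a variant that
skips only the claim. Whether the second variant is safe in a real setup
(for example with respect to roaming) is not shown here.

```c
#include <stdbool.h>

/* Hypothetical verdicts standing in for the "allow" and "handled" labels. */
enum bla_verdict {
    BLA_ALLOW,   /* caller keeps processing the frame */
    BLA_HANDLED  /* frame is consumed by BLA, i.e. dropped */
};

/* Outcome of the "no claim exists yet" branch in this model. */
struct bla_result {
    enum bla_verdict verdict;
    bool claim_sent;
};

/* Quick-and-dirty variant: a local client is neither claimed nor passed on. */
struct bla_result rx_unclaimed_drop(bool is_local_client)
{
    struct bla_result res = { BLA_HANDLED, false };

    if (!is_local_client) {
        res.claim_sent = true;
        res.verdict = BLA_ALLOW;
    }
    return res;
}

/* Alternative: skip only the claim, but still let the frame pass. */
struct bla_result rx_unclaimed_skip_claim(bool is_local_client)
{
    struct bla_result res = { BLA_ALLOW, false };

    if (!is_local_client)
        res.claim_sent = true;
    return res;
}
```

For a source address that is a local client, the first function reports
BLA_HANDLED while the second reports BLA_ALLOW; neither sends a claim. That
difference is why the overnight ping test says nothing about the claim
check by itself.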

>
> >
> > From time to time I see looping unicast packets in my backbone network.
> > This unicast looping starts directly after one of the nodes A to C has
> > claimed the MAC address of my PC. The looping frame is then the ping
> > request sent by the PC. I have a Wireshark recording that shows this
> > behaviour; it was made in my backbone by mirroring one of the switch ports
> > to which a mesh node is connected.
>
> Is it really the ping packet looping? If yes, which nodes are part of the
> loop? Normally we only see broadcast packets looping. In #216 it was also
> broadcast packets where we have seen duplicates, and this was more a locking
> problem leading to creation of the same packets again and again.
>
> >
> > I am not sure if I understood bla correctly but isn't it nonsense that a
> > bla backbone gateway claims MAC addresses from its own backbone (i.e. the
> > one it is directly connected to via its ethernet port)?
>
> Yes, that appears to be nonsense indeed. Do you happen to have DAT enabled?
> There were also some problems with that which are fixed by now.

DAT is enabled. But my problem starts with a gratuitous ARP containing a
claim, not with a multiplication of normal ARP requests or responses.

>
> >
> > A simple change in batadv_bla_rx seems to solve this problem: add an
> > additional check before claiming a new mac address: if this address is
> > already known from the tt local table (via command batadv_is_my_client)
> > don't claim it.
> >
> > This seems to solve my problem as far as I have tested so far. Any
> > thoughts about that?
>
> This will prevent roaming from one of your nodes connected to the backbone
> (A-C) to the mesh-only node D.
>
> I would suggest upgrading and testing again, and disabling DAT if the
> problem is still present (you should still report it if DAT makes a
> difference in that case). If you still see a problem then, we probably have
> something unsolved, and then I'd like to understand which nodes are part of
> the loop.
>
> Thank you!
>      Simon
> [Attachment "signature.asc" deleted by Andreas Pape/Phoenix Contact]

Thanks and regards,
Andreas





* Re: [B.A.T.M.A.N.] Antwort: Re: Looping unicast packets when using BLA
  2016-02-09  7:01   ` [B.A.T.M.A.N.] Antwort: " Andreas Pape
@ 2016-02-11  9:19     ` Andreas Pape
  2016-02-11 11:08       ` Simon Wunderlich
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Pape @ 2016-02-11  9:19 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking; +Cc: B.A.T.M.A.N

Hi,

I want to give some short feedback on my attempt to use
batman-adv-2016.0 instead of the older version I used first.

Unfortunately, using batman-adv-2016.0 does not solve my problem. I can
still see claim frames sent by the mesh gateways into the common backbone
network for MAC addresses from the common backbone network itself. I
enabled both BLA and DAT support again. Furthermore, I still have
problems with DAT, because multiple ARP replies coming out of the mesh are
visible in the backbone network.

That reminds me that I forgot to mention in my earlier mail that
I did not test with the original batman-adv-2014.4.0 first, but with a
version that had the patch applied which I sent to the mailing list in
March last year concerning possible fixes for DAT in BLA setups. If I
remember correctly, there were two main issues in batman-adv-2014.4.0 when
using DAT in combination with BLA:
1. Broadcast ARP requests from the backbone network are handled by each
gateway, leading to multiple DAT address resolutions in parallel.
2. As DAT tunnels broadcasts in special batman-adv unicast
frames, the current BLA code does not seem to prevent these broadcasts
from reaching the backbone network, as it does for normal broadcasts
coming from the mesh and heading for the backbone.
Both effects together lead to a multiplication of ARP requests and
replies. My patch of last year tried to address this.

The good news is that disabling DAT in batman-adv-2016.0 seems to solve the
issues I observed (strangely, even the erroneous claim frames in the
backbone network). But I think DAT is a clever feature to
reduce broadcast load in the mesh network. Wouldn't it be useful to dig a
little bit deeper into the combined use of DAT and BLA? I would volunteer
for testing and providing ideas for improving the behaviour.

Or do you think that I have an issue with my old 2.6.32 kernel?

Best regards,
Andreas


* Re: [B.A.T.M.A.N.] Antwort: Re: Looping unicast packets when using BLA
  2016-02-11  9:19     ` Andreas Pape
@ 2016-02-11 11:08       ` Simon Wunderlich
  2016-02-12 10:40         ` [B.A.T.M.A.N.] Antwort: Re: " Andreas Pape
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Wunderlich @ 2016-02-11 11:08 UTC (permalink / raw)
  To: Andreas Pape
  Cc: The list for a Better Approach To Mobile Ad-hoc Networking, B.A.T.M.A.N


Hi Andreas,

On Thursday 11 February 2016 10:19:07 Andreas Pape wrote:
> Hi,
>
> I want to give some short feedback on my attempt to use
> batman-adv-2016.0 instead of the older version I used first.
>
> Unfortunately, using batman-adv-2016.0 does not solve my problem. I can
> still see claim frames sent by the mesh gateways into the common backbone
> network for MAC addresses from the common backbone network itself. I
> enabled both BLA and DAT support again. Furthermore, I still have
> problems with DAT, because multiple ARP replies coming out of the mesh are
> visible in the backbone network.
>
> That reminds me that I forgot to mention in my earlier mail that
> I did not test with the original batman-adv-2014.4.0 first, but with a
> version that had the patch applied which I sent to the mailing list in
> March last year concerning possible fixes for DAT in BLA setups. If I
> remember correctly, there were two main issues in batman-adv-2014.4.0 when
> using DAT in combination with BLA:
> 1. Broadcast ARP requests from the backbone network are handled by each
> gateway, leading to multiple DAT address resolutions in parallel.

That shouldn't be a problem on its own.

> 2. As DAT tunnels broadcasts in special batman-adv unicast
> frames, the current BLA code does not seem to prevent these broadcasts
> from reaching the backbone network, as it does for normal broadcasts
> coming from the mesh and heading for the backbone.
> Both effects together lead to a multiplication of ARP requests and
> replies. My patch of last year tried to address this.

Hm, I see. I just checked the code and it seems we fixed this issue for speedy
join in the meantime (affecting TT), but for BLA the problem is still
present.

Wouldn't it be sufficient to add something like a check for backbones
(batadv_bla_is_backbone_gw) into batadv_recv_unicast_packet() and drop packets
if they came from the same backbone?
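
As a rough illustration of that idea, here is a user-space sketch. It is not
the kernel implementation: backbone_table, is_same_backbone_gw and
should_drop_unicast are hypothetical names, and the table is only a stand-in
for whatever state batadv_bla_is_backbone_gw consults. The rule modelled is
simply "drop a unicast frame if its originator is a gateway on our own
backbone".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6
#define MAX_GW 8

/* Hypothetical stand-in for the gateways known on our own backbone. */
struct backbone_table {
    unsigned char gw[MAX_GW][ETH_ALEN];
    size_t count;
};

/* Stand-in for the lookup a real backbone-gateway check would perform. */
bool is_same_backbone_gw(const struct backbone_table *tbl,
                         const unsigned char *orig)
{
    size_t i;

    for (i = 0; i < tbl->count; i++) {
        if (memcmp(tbl->gw[i], orig, ETH_ALEN) == 0)
            return true;
    }
    return false;
}

/* Proposed rule: drop unicast frames originated by a same-backbone gateway. */
bool should_drop_unicast(const struct backbone_table *tbl,
                         const unsigned char *orig)
{
    return is_same_backbone_gw(tbl, orig);
}
```

In the setup from this thread the table would hold gateways A, B and C, so a
frame tunnelled from A to B would be dropped at B, while a frame originated
by mesh-only node D would still be accepted.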

I have found your patch from last year. Would you like to rebase/split your
patch to address the remaining issues? That would help us a lot. Please also
add a proper commit message. I promise to be more responsive this time. :)

> 
> The good news is that disabling DAT in batman-adv-2016.0 seems to solve the
> issues I observed (strangely, even the erroneous claim frames in the
> backbone network). But I think DAT is a clever feature to
> reduce broadcast load in the mesh network. Wouldn't it be useful to dig a
> little bit deeper into the combined use of DAT and BLA? I would volunteer
> for testing and providing ideas for improving the behaviour.

If you could help, that would be great! Your patch from last year already is a 
good start, so I'm sure you are capable of working on that. I would be happy 
to help with that, too.

> 
> Or do you think that I have an issue with my old 2.6.32 kernel?

I don't think so. To me it looks like the issue could still be present in 
current versions ...

Cheers,
     Simon



* [B.A.T.M.A.N.] Antwort: Re: Re: Antwort: Re: Looping unicast packets when using BLA
  2016-02-11 11:08       ` Simon Wunderlich
@ 2016-02-12 10:40         ` Andreas Pape
  2016-02-12 13:04           ` Simon Wunderlich
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Pape @ 2016-02-12 10:40 UTC (permalink / raw)
  To: Simon Wunderlich
  Cc: The list for a Better Approach To Mobile Ad-hoc Networking, B.A.T.M.A.N

Hi Simon,


Simon Wunderlich <sw@simonwunderlich.de> wrote on 11.02.2016 12:08:25:

> From: Simon Wunderlich <sw@simonwunderlich.de>
> To: Andreas Pape <APape@phoenixcontact.com>
> Cc: The list for a Better Approach To Mobile Ad-hoc Networking
> <b.a.t.m.a.n@lists.open-mesh.org>, "B.A.T.M.A.N"
> <b.a.t.m.a.n-bounces@lists.open-mesh.org>
> Date: 11.02.2016 12:08
> Subject: Re: Re: [B.A.T.M.A.N.] Antwort: Re: Looping unicast packets
> when using BLA
>
> Hi Andreas,
>
> On Thursday 11 February 2016 10:19:07 Andreas Pape wrote:
> > Hi,
> >
> > I want to give some short feedback on my attempt to use
> > batman-adv-2016.0 instead of the older version I used first.
> >
> > Unfortunately, using batman-adv-2016.0 does not solve my problem. I can
> > still see claim frames sent by the mesh gateways into the common backbone
> > network for MAC addresses from the common backbone network itself. I
> > enabled both BLA and DAT support again. Furthermore, I still have
> > problems with DAT, because multiple ARP replies coming out of the mesh are
> > visible in the backbone network.
> >
> > That reminds me that I forgot to mention in my earlier mail that
> > I did not test with the original batman-adv-2014.4.0 first, but with a
> > version that had the patch applied which I sent to the mailing list in
> > March last year concerning possible fixes for DAT in BLA setups. If I
> > remember correctly, there were two main issues in batman-adv-2014.4.0 when
> > using DAT in combination with BLA:
> > 1. Broadcast ARP requests from the backbone network are handled by each
> > gateway, leading to multiple DAT address resolutions in parallel.
>
> That shouldn't be a problem on its own.

I think I wasn't precise enough on this point. I meant the effect that
a broadcast ARP request coming from a common backbone reaches all gateways.
If several gateways happen to be able to answer that request already thanks
to DAT, the current code sends an ARP reply from each gateway that is able
to answer. The broadcast does not even reach the mesh if all gateways can
answer the request (as far as I have understood the code). Therefore
broadcast handling in the mesh layer does not solve this problem.
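
The effect can be put into numbers with a small user-space model. This is a
sketch of the behaviour as described in this mail, not of the actual DAT
code: arp_outcome and backbone_arp_request are hypothetical names, and each
gateway is reduced to a single flag saying whether its DAT cache can answer
the request.

```c
#include <stdbool.h>
#include <stddef.h>

/* Result of one broadcast ARP request arriving from the backbone. */
struct arp_outcome {
    size_t replies;     /* ARP replies sent back to the backbone */
    size_t unanswered;  /* gateways that could not answer from their cache */
};

/*
 * Every gateway on the backbone sees the same broadcast request.
 * In this model each gateway with a matching cache entry answers on its own.
 */
struct arp_outcome backbone_arp_request(const bool *has_dat_entry, size_t n_gw)
{
    struct arp_outcome out = { 0, 0 };
    size_t i;

    for (i = 0; i < n_gw; i++) {
        if (has_dat_entry[i])
            out.replies++;
        else
            out.unanswered++;
    }
    return out;
}
```

With three gateways that all hold an entry, the model yields three replies
for a single request and no gateway that would pass the request on, which
matches the observation that duplicate suppression inside the mesh never
gets a chance to act.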
>
> > 2. As DAT tunnels broadcasts in special batman-adv unicast
> > frames, the current BLA code does not seem to prevent these broadcasts
> > from reaching the backbone network, as it does for normal broadcasts
> > coming from the mesh and heading for the backbone.
> > Both effects together lead to a multiplication of ARP requests and
> > replies. My patch of last year tried to address this.
>
> Hm, I see. I just checked the code and it seems we fixed this issue for speedy
> join in the meantime (affecting TT), but for BLA the problem is still
> present.
>
> Wouldn't it be sufficient to add something like a check for backbones
> (batadv_bla_is_backbone_gw) into batadv_recv_unicast_packet() and drop packets
> if they came from the same backbone?
>
That's a good question. This is something I did not dare to do last year
because I cannot foresee possible negative implications. Perhaps someone more
experienced with batman-adv routing should answer this. Therefore I focussed
on fixing this for the DAT ARP handling only.
Doing as you propose would most likely have the positive side effect that I
get rid of the looping unicast packets in my backbone network, too. But as
mentioned in my prior mail, I think these looping unicast packets might have
another cause. The question is: why does a gateway forward a unicast packet
to its own backbone when it was received via the mesh and its destination MAC
is behind an originator somewhere else in the mesh (an originator that is not
connected to the same backbone)? If this can only happen when the entry in the
global tt expires (like a MAC address table entry expiring in a switch, after
which the switch floods incoming packets to all ports), then I would think
blocking all unicast traffic coming from another backbone gw of the same
backbone network is the smartest and easiest solution.
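
The switch analogy used above can be made concrete with a toy learning switch
in Python. It only illustrates the analogy (an expired table entry turns a
unicast into a flood); it says nothing about how the global tt actually
behaves. The class and parameter names are hypothetical.

```python
# Toy learning switch with entry aging; purely illustrative, hypothetical names.
class LearningSwitch:
    def __init__(self, ports, max_age):
        self.ports = ports
        self.max_age = max_age
        self.table = {}  # MAC -> (port, time the entry was last refreshed)

    def receive(self, src, dst, in_port, now):
        """Learn the source, then return the list of output ports."""
        self.table[src] = (in_port, now)
        entry = self.table.get(dst)
        if entry is not None and now - entry[1] <= self.max_age:
            return [entry[0]]  # known destination: forward to one port
        # unknown or expired destination: flood to every other port
        return [p for p in self.ports if p != in_port]


sw = LearningSwitch(ports=[1, 2, 3], max_age=300)
sw.receive("mac-e", "mac-pc", in_port=3, now=0)  # switch learns mac-e on port 3
fresh = sw.receive("mac-pc", "mac-e", in_port=1, now=10)    # entry still valid
stale = sw.receive("mac-pc", "mac-e", in_port=1, now=1000)  # entry has aged out
print(fresh, stale)
```

While the entry is fresh the frame leaves on one port only; once it has aged
out the same frame is sent to all other ports.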

> I have found your patch from last year. Would you like to rebase/split your
> patch to address the remaining issues? That would help us a lot. Please also
> put a proper patch message. I promise to be more responsive this time. :)
>

I'm trying to split last year's patch into logically separate pieces, update
them to apply to the latest master branch of the batman-adv.git repository,
and will mail them for further discussion.

> >
> > Good news is that disabling dat in batman-adv-2016.0 seems to solve my
> > observed issues (in strange ways even the observed erroneous claim frames
> > in the backbone network....). But I think dat is a clever feature to
> > reduce broadcast load in the mesh network. Wouldn't it be useful to dig a
> > little bit deeper into the combined use of dat and bla? I would volunteer
> > for testing and providing ideas for improving the behaviour.
>
> If you could help, that would be great! Your patch from last year already
> is a good start, so I'm sure you are capable of working on that. I would be
> happy to help with that, too.
>
> >
> > Or do you think that I have an issue with my old 2.6.32 kernel?
>
> I don't think so. To me it looks like the issue could still be present in
> current versions ...
>
> Cheers,
>      Simon

[Attachment "signature.asc" deleted by Andreas Pape/Phoenix Contact]



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [B.A.T.M.A.N.] Antwort: Re: Re: Antwort: Re: Looping unicast packets when using BLA
  2016-02-12 10:40         ` [B.A.T.M.A.N.] Antwort: Re: " Andreas Pape
@ 2016-02-12 13:04           ` Simon Wunderlich
  2016-02-12 14:07             ` [B.A.T.M.A.N.] Antwort: " Andreas Pape
  0 siblings, 1 reply; 9+ messages in thread
From: Simon Wunderlich @ 2016-02-12 13:04 UTC (permalink / raw)
  To: Andreas Pape; +Cc: The list for a Better Approach To Mobile Ad-hoc Networking

[-- Attachment #1: Type: text/plain, Size: 4204 bytes --]

Hi Andreas,

On Friday 12 February 2016 11:40:21 Andreas Pape wrote:
> > > [...]
> > > using dat in combination with bla:
> > > 1. Broadcast ARP requests from the backbone network are handled by each
> > > gateway, leading to multiple dat address resolutions in parallel.
> > 
> > That shouldn't be a problem on its own.
> 
> I think I wasn't precise enough concerning this point. I meant the effect,
> that
> a broadcast ARP coming from a common backbone reaches all gateways. If now
> accidentally
> several gateways can already answer that request due to dat, then the
> current code sends
> an arp reply from each gateway being able to answer. This broadcast does
> not even reach
> the mesh if all gateways can answer the request (as far as I have
> understood the code).
> Therefore broadcast handling in the mesh layer does not solve this
> problem.

Yes, we may have multiple gateways answering with an ARP reply. But how is
this a problem? It is redundant, yes, but it's just a unicast sent back. I
don't see this as a problem yet ...

> 
> > > 2. As dat uses tunneling of broadcasts in special batman-adv unicast
> > > frames, the current bla code does not seem to prevent these broadcasts
> > > from reaching the backbone network as it is done for normal broadcast
> > > coming from the mesh and heading for the backbone.
> > > Both effects together lead to a multiplication of arp requests and
> > > replies. My patch of last year tried to address this.
> > 
> > Hm, I see. I just checked the code and it seems we fixed this issue
> > for speedy
> > join in the mean time (affecting TT), but for bla the problem is still
> > present.
> > 
> > Wouldn't it be sufficient to add something like a check for backbones (
> > batadv_bla_is_backbone_gw) into batadv_recv_unicast_packet() and drop
> 
> packets
> 
> > if they came from the same backbone?
> 
> That's a good question. This is something I did not dare to do last year
> because
> I cannot foresee possible negative implications. Perhaps someone more
> experienced with
> batman-adv routing should answer this. Therefore I focussed on fixing this
> for the DAT ARP handling only.
> But doing so as you propose would most likely have the positive side
> effect that I will get rid
> of the looping unicast packets in my backbone network, too. But as
> mentioned in my prior mail
> I think this looping unicast packets might have another cause. The
> question is: why does
> a gateway forward a unicast packet received via the mesh with a
> destination mac behind
> an originator somewhere else in the mesh (that originator is not connected
> to the same backbone) to
> its own backbone? If this only can happen if the entry in the global tt
> expires (like a mac adress
> table expires in a switch and the switch starts broadcasting incoming
> packets to all ports), then
> I would think blocking all unicast traffic coming from another backbone gw
> of the same backbone network
> is the smartest and easiest solution.

Yeah, agreed. There shouldn't be any unicast messages sent among gateways
through the mesh. We should probably not only avoid receiving them (which is
all I suggested so far), but also avoid sending DAT requests to other gateways
on the same backbone. The check should be similar and simple ... After all, if
there is another gateway capable of answering, it will also receive the
request and doesn't need it passed on from somebody else on the same backbone.

Regarding the TT question/expiration: as a receiver we don't check the
destination address and just accept the packet for reception. But as I said,
packets among gateways on the same backbone shouldn't be sent or received.
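
A minimal sketch of that rule, written as a toy Python model rather than
kernel code. The function names below are hypothetical stand-ins; in
particular same_backbone() only plays the role that a check like
batadv_bla_is_backbone_gw would play, and does not reflect its real signature
or behaviour.

```python
# Toy model of the proposed rule; names are hypothetical, not batman-adv API.
def same_backbone(peer, backbone_gws):
    """True if `peer` is a known gateway on our own backbone."""
    return peer in backbone_gws


def accept_unicast_from_mesh(src_orig, backbone_gws):
    """Receive side: drop unicast packets sent by a same-backbone gateway."""
    return not same_backbone(src_orig, backbone_gws)


def dat_request_targets(candidates, backbone_gws):
    """Send side: do not forward DAT requests to same-backbone gateways."""
    return [c for c in candidates if not same_backbone(c, backbone_gws)]


backbone_gws = {"B", "C"}  # the other gateways, as seen from gateway A
from_b = accept_unicast_from_mesh("B", backbone_gws)  # peer on our backbone
from_d = accept_unicast_from_mesh("D", backbone_gws)  # ordinary mesh node
targets = dat_request_targets(["B", "D", "C"], backbone_gws)
print(from_b, from_d, targets)
```

Both directions use the same membership test, which matches the remark that
the send-side check should be similar to the receive-side one.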

> 
> > I have found your patch from last year. Would you like to rebase/split
> > your patch to address the remaining issues? That would help us a lot.
> > Please also put a proper patch message. I promise to be more responsive
> > this time. :)
> 
> I'm trying to split my last year's patch into logical separate pieces,
> update them to be compliant to the latest master branch of the
> batman-adv.git repository and will mail them for further discussion.

Cool, thanks! I'm looking forward to it. :)

Cheers,
     Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [B.A.T.M.A.N.] Antwort: Re: Antwort: Re: Re: Antwort: Re: Looping unicast packets when using BLA
  2016-02-12 13:04           ` Simon Wunderlich
@ 2016-02-12 14:07             ` Andreas Pape
  2016-02-15  8:54               ` Simon Wunderlich
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Pape @ 2016-02-12 14:07 UTC (permalink / raw)
  To: Simon Wunderlich
  Cc: The list for a Better Approach To Mobile Ad-hoc Networking

Simon Wunderlich <sw@simonwunderlich.de> wrote on 12.02.2016 14:04:23:

> From: Simon Wunderlich <sw@simonwunderlich.de>
> To: Andreas Pape <APape@phoenixcontact.com>
> Cc: The list for a Better Approach To Mobile Ad-hoc Networking
> <b.a.t.m.a.n@lists.open-mesh.org>
> Date: 12.02.2016 14:04
> Subject: Re: Antwort: Re: Re: [B.A.T.M.A.N.] Antwort: Re: Looping
> unicast packets when using BLA
>
> Hi Andreas,
>
> On Friday 12 February 2016 11:40:21 Andreas Pape wrote:
> > > > [...]
> > > > using dat in combination with bla:
> > > > 1. Broadcast ARP requests from the backbone network are handled by
> > > > each gateway, leading to multiple dat address resolutions in parallel.
> > >
> > > That shouldn't be a problem on its own.
> >
> > I think I wasn't precise enough concerning this point. I meant the effect,
> > that a broadcast ARP coming from a common backbone reaches all gateways.
> > If now accidentally several gateways can already answer that request due
> > to dat, then the current code sends an arp reply from each gateway being
> > able to answer. This broadcast does not even reach the mesh if all
> > gateways can answer the request (as far as I have understood the code).
> > Therefore broadcast handling in the mesh layer does not solve this
> > problem.
>
> Yes, we may have multiple gateways answering with an ARP reply. But how is
> this a problem? It is redundant, yes, but it's just a unicast sent back. I
> don't see this as a problem yet ...
>

I would like to prevent duplicated packets as much as possible, even if
they are unicast packets that are normally harmless for typical PC hardware.
But I know of enough small embedded devices (sensors and stuff like that)
which don't like that.

> >
> > > > 2. As dat uses tunneling of broadcasts in special batman-adv unicast
> > > > frames, the current bla code does not seem to prevent these broadcasts
> > > > from reaching the backbone network as it is done for normal broadcast
> > > > coming from the mesh and heading for the backbone.
> > > > Both effects together lead to a multiplication of arp requests and
> > > > replies. My patch of last year tried to address this.
> > >
> > > Hm, I see. I just checked the code and it seems we fixed this issue
> > > for speedy join in the mean time (affecting TT), but for bla the problem
> > > is still present.
> > >
> > > Wouldn't it be sufficient to add something like a check for backbones
> > > (batadv_bla_is_backbone_gw) into batadv_recv_unicast_packet() and drop
> > > packets if they came from the same backbone?
> >
> > That's a good question. This is something I did not dare to do last year
> > because I cannot foresee possible negative implications. Perhaps someone
> > more experienced with batman-adv routing should answer this. Therefore I
> > focussed on fixing this for the DAT ARP handling only.
> > But doing so as you propose would most likely have the positive side
> > effect that I will get rid of the looping unicast packets in my backbone
> > network, too. But as mentioned in my prior mail I think these looping
> > unicast packets might have another cause. The question is: why does a
> > gateway forward a unicast packet received via the mesh with a destination
> > mac behind an originator somewhere else in the mesh (that originator is
> > not connected to the same backbone) to its own backbone? If this only can
> > happen if the entry in the global tt expires (like a mac address table
> > expires in a switch and the switch starts broadcasting incoming packets
> > to all ports), then I would think blocking all unicast traffic coming
> > from another backbone gw of the same backbone network is the smartest
> > and easiest solution.
>
> Yeah, agreed. There shouldn't be any unicast messages to be sent among
> gateways through the mesh. We probably should not only avoid the
> receiving (as I suggested only), but also sending DAT requests to other
> gateways on the same backbone. The check should be similar and simple ...
> After all, if there is another gateway capable of answering, it will also
> receive the request and doesn't need it passed from somebody else on the
> same backbone.
>
> Regarding the TT question/expiration, as a receiver we don't check the
> destination address and just accept the packet for reception. But as I
> said, packets among gateways on the same backbone shouldn't be sent or
> received.
>
> >
> > > I have found your patch from last year. Would you like to rebase/split
> > > your patch to address the remaining issues? That would help us a lot.
> > > Please also put a proper patch message. I promise to be more responsive
> > > this time. :)
> >
> > I'm trying to split my last year's patch into logical separate pieces,
> > update them to be compliant to the latest master branch of the
> > batman-adv.git repository and will mail them for further discussion.
>
> Cool, thanks! I'm looking forward to it. :)

I've just sent the patches. They reflect the state of my "experiments" from
last year, which means that your latest proposal is not integrated yet.
I quickly updated the devices in my test setup and it looks good (no
looping ARP requests or multiple replies seen so far).

Regards,
Andreas




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [B.A.T.M.A.N.] Antwort: Re: Antwort: Re: Re: Antwort: Re: Looping unicast packets when using BLA
  2016-02-12 14:07             ` [B.A.T.M.A.N.] Antwort: " Andreas Pape
@ 2016-02-15  8:54               ` Simon Wunderlich
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Wunderlich @ 2016-02-15  8:54 UTC (permalink / raw)
  To: Andreas Pape; +Cc: The list for a Better Approach To Mobile Ad-hoc Networking

[-- Attachment #1: Type: text/plain, Size: 2767 bytes --]

Hi Andreas,

On Friday 12 February 2016 15:07:59 Andreas Pape wrote:
> Simon Wunderlich <sw@simonwunderlich.de> wrote on 12.02.2016 14:04:23:
> > From: Simon Wunderlich <sw@simonwunderlich.de>
> > To: Andreas Pape <APape@phoenixcontact.com>
> > Cc: The list for a Better Approach To Mobile Ad-hoc Networking
> > <b.a.t.m.a.n@lists.open-mesh.org>
> > Date: 12.02.2016 14:04
> > Subject: Re: Antwort: Re: Re: [B.A.T.M.A.N.] Antwort: Re: Looping
> > unicast packets when using BLA
> > 
> > Hi Andreas,
> > 
> > On Friday 12 February 2016 11:40:21 Andreas Pape wrote:
> > > > > [...]
> > > > > using dat in combination with bla:
> > > > > 1. Broadcast ARP requests from the backbone network are handled by
> > > > > each gateway, leading to multiple dat address resolutions in
> > > > > parallel.
> > > >
> > > > That shouldn't be a problem on its own.
> > >
> > > I think I wasn't precise enough concerning this point. I meant the
> > > effect, that a broadcast ARP coming from a common backbone reaches all
> > > gateways. If now accidentally several gateways can already answer that
> > > request due to dat, then the current code sends an arp reply from each
> > > gateway being able to answer. This broadcast does not even reach the
> > > mesh if all gateways can answer the request (as far as I have
> > > understood the code).
> > > Therefore broadcast handling in the mesh layer does not solve this
> > > problem.
> >
> > Yes, we may have multiple gateways answering with an ARP reply. But how
> > is this a problem? It is redundant, yes, but it's just a unicast sent
> > back. I don't see this as a problem yet ...
>
> I would like to prevent duplicated packets as much as possible, even if
> they are unicast packets normally harmless for typical PC hardware. But I
> know of enough small embedded devices (sensors and stuff like that) which
> don't like that.....
> 

That's a good point. In general it could be debated whether we prefer redundant
replies to no replies at all. But I agree with your point, especially since
having answers from different devices may confuse a switch because it thinks
there is some MAC flapping or, worse, sees answers coming from different ports.
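
The switch-side concern can be sketched with a toy Python model of a learning
switch that counts how often a source MAC changes port. The function name is
hypothetical, and the scenario rests on an assumption that is not confirmed in
this thread: that each gateway's reply carries the resolved host's MAC as its
Ethernet source.

```python
# Toy learning switch that counts MAC moves; hypothetical names throughout.
def count_mac_moves(frames):
    """frames: iterable of (source MAC, ingress port). Return moves per MAC."""
    table = {}  # MAC -> port where it was last seen
    moves = {}  # MAC -> number of times it changed port
    for mac, port in frames:
        if mac in table and table[mac] != port:
            moves[mac] = moves.get(mac, 0) + 1
        table[mac] = port
    return moves


# Assumption: each gateway's reply carries the resolved host's MAC as its
# Ethernet source, so the switch sees that MAC on the ports of A, B and C.
frames = [("mac-e", 1), ("mac-e", 2), ("mac-e", 3)]
moves = count_mac_moves(frames)
print(moves)
```

Under that assumption, three redundant replies make the same MAC move twice
within one exchange, which is the pattern a switch may report as flapping.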

>> [...]
> 
> I've just sent the patches. They have the state of my "experiments" last
> year. That means that your latest proposal is not integrated yet.
> I quickly updated my devices in my test setup and it looks good (no
> looping arp requests or multiple replies seen so far).

Thanks a lot! I've reviewed them. We still have some formatting work to do, so
please bear with us through the iterations. Splitting and cleaning them up was
definitely a great start; this is a very good contribution. :)

Thanks,
     Simon

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-02-15  8:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-08 11:35 [B.A.T.M.A.N.] Looping unicast packets when using BLA Andreas Pape
2016-02-08 12:29 ` Simon Wunderlich
     [not found] ` <4917381.eeOl7B1qNb-2016-02-09-07-20-04@prime>
2016-02-09  7:01   ` [B.A.T.M.A.N.] Antwort: " Andreas Pape
2016-02-11  9:19     ` Andreas Pape
2016-02-11 11:08       ` Simon Wunderlich
2016-02-12 10:40         ` [B.A.T.M.A.N.] Antwort: Re: " Andreas Pape
2016-02-12 13:04           ` Simon Wunderlich
2016-02-12 14:07             ` [B.A.T.M.A.N.] Antwort: " Andreas Pape
2016-02-15  8:54               ` Simon Wunderlich
