* [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
@ 2012-07-02 13:30 Guido Iribarren
  2012-07-02 13:57 ` Guido Iribarren
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Guido Iribarren @ 2012-07-02 13:30 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

(which roughly translates as "batman gone nuts?")
Hey great devs!
we've been having a particular issue in deltalibre and quintanalibre
(local WCN) with batman-adv, but so far we haven't found a precise way
to reproduce it.
The symptom is that (after some reboots or physical displacements?)
one batman-adv host becomes unreachable on layer3, although it is seen
in the originators table, and can be batctl ping'ed or batctl
traceroute'd with no problem whatsoever.

What's more, it is not unreachable from the whole network, but only
from a few other nodes. So, let's say that the nearer nodes can layer3
ping it, but some others farther away cannot (although I can't be sure
it depends on the hop distance).
All of them can batctl ping it (layer2)
A hard reboot of all the nodes solves it, connectivity is restored in
all directions.
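
To make the layer2-vs-layer3 comparison repeatable across nodes, the two
probes can be wrapped in a small script. This is only a sketch: the
function names are made up for illustration, and it assumes that both
batctl ping and ping exit non-zero when no reply arrives.

```shell
# classify_reachability L2_STATUS L3_STATUS
# Takes the exit statuses of the two probes and prints a verdict.
classify_reachability() {
    if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
        echo "ok: reachable on layer 2 and layer 3"
    elif [ "$1" -eq 0 ]; then
        echo "SUSPECT: batctl ping works but IP ping fails"
    elif [ "$2" -eq 0 ]; then
        echo "odd: IP ping works but batctl ping fails"
    else
        echo "down: unreachable on both layers"
    fi
}

# probe_node MAC_OR_BAT_HOST IP_ADDRESS
# Runs both probes against one node and prints the verdict.
# Assumption: a non-zero exit status means "no reply" for both tools.
probe_node() {
    l2=0
    l3=0
    batctl ping -c 3 "$1" >/dev/null 2>&1 || l2=$?
    ping -c 3 "$2" >/dev/null 2>&1 || l3=$?
    classify_reachability "$l2" "$l3"
}
```

Calling probe_node with ana's MAC (or bat-hosts name) and its IP address
on each node would show which nodes hit the "batctl ping works, ping
doesn't" case.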

Thing is, I've just come across it again, and managed to do some tests
to aid in description / debugging.
As an aid in understanding network topology,
I'm attaching the wonderful output of "batctl vd dot |grep -v TT" for
your viewing delight

problem node is ana
it can be reached from ruth and hquilla (direct neighbours)
but arping behaves erratically from colmena or charly
and normal ping (v4 or v6) doesn't receive any reply at all when run
from colmena or charly

I used arping, with and without -b , and seemed like i could narrow
the problem down to incoming broadcast packet handling, but further
tests just left me more puzzled!

all nodes are tl-mr3220 running openwrt trunk r31316 with batman-adv
2012.2.0 , driver ath9k
secondary interfaces named _wlan1 are all tl-wn722n which uses driver ath9k_htc
nodes are around 100meters (+/-50mts) apart from each other

this behaviour has been observed (but not reported) in dissimilar
setups, using ubnt bullet2 mixed with mr3220, running r29936 with
batman-adv 2011.4.0 , with nodes 1 or 2km apart from each other.

Tests are the combined crude output of batctl td and arping, so to
make this email easy on the eye, I'm publishing them elsewhere:
http://pastebin.com/6PPwN3PS

The live openwrt configuration can be analysed in detail at
https://bitbucket.org/guidoi/deltalibre-configs/src
(it's a free, open network after all! :D )
in particular:
ana -> https://bitbucket.org/guidoi/deltalibre-configs/src/6de4ce970fe2/mac/54_E6_FC_BE_26_12
hquilla -> https://bitbucket.org/guidoi/deltalibre-configs/src/6de4ce970fe2/mac/54_E6_FC_BE_28_34
colmena -> https://bitbucket.org/guidoi/deltalibre-configs/src/6de4ce970fe2/mac/54_E6_FC_BE_29_D2

Thanks a lot for the attention,
Hope that you are having fun, and that I'm not spoiling it :)

Cheers!

Gui

[-- Attachment #2: dl_2012-07-02_07:29:04.png --]
[-- Type: image/png, Size: 140912 bytes --]

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 13:30 [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping Guido Iribarren
@ 2012-07-02 13:57 ` Guido Iribarren
  2012-07-02 14:36   ` Antonio Quartulli
  2012-07-02 16:39 ` Gioacchino Mazzurco
  2012-07-21 18:54 ` Gioacchino Mazzurco
  2 siblings, 1 reply; 24+ messages in thread
From: Guido Iribarren @ 2012-07-02 13:57 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

> I used arping, with and without -b , and seemed like i could narrow
> the problem down to incoming broadcast packet handling, but further
> tests just left me more puzzled!

Well, seems colmena is the uncooperative bathost
another log:
http://pastebin.com/FMD9Lieq
that can be summarized as follows

### From COLMENA-CASA, can ping bochita but not ana
### From PEREYRA, can ping bochita but not ana
### From COLMENA, works perfect to both destinations

colmena-casa and pereyra must pass through colmena, which is for some
reason letting batctl pings, OGMs, and whatnot pass through on their
way to ana, but no ICMP echo requests or TCP traffic whatsoever if
their final destination is ana.
If the final destination is bochita, everything works as expected.

Any ideas?

I'm going to delay rebooting colmena as long as i can, in case someone
comes up with an insightful test to run :)

Gui

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 13:57 ` Guido Iribarren
@ 2012-07-02 14:36   ` Antonio Quartulli
  2012-07-02 14:47     ` Guido Iribarren
                       ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Antonio Quartulli @ 2012-07-02 14:36 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Mon, Jul 02, 2012 at 10:57:57AM -0300, Guido Iribarren wrote:
> > I used arping, with and without -b , and seemed like i could narrow
> > the problem down to incoming broadcast packet handling, but further
> > tests just left me more puzzled!
> 
> Well, seems colmena is the uncooperative bathost
> another log:
> http://pastebin.com/FMD9Lieq
> that can be summarized as follows
> 
> ### From COLMENA-CASA, can ping bochita but not ana
> ### From PEREYRA, can ping bochita but not ana
> ### From COLMENA, works perfect to both destinations
> 
> colmena-casa and pereyra must pass through colmena, which is for some
> reason letting batctl pings, OGMs, and whatnot pass through on their
> way to ana, but no ICMP echo requests or TCP traffic whatsoever if
> their final destination is ana.
> If the final destination is bochita, everything works as expected.
> 
> Any ideas?
> 
> I'm going to delay rebooting colmena as long as i can, in case someone
> comes up with an insightful test to run :)

Hello!

Has debug support been compiled into batman-adv? If yes, it would be
interesting to see the output of the tt log (batctl ll tt; batctl l).

Recently we fixed a bug whose fix has not been released yet. If we are sure
that this is the cause, you could possibly try an upgrade to a more recent
dev-version. But let's see the log first (if possible).

Cheers,

> 
> Gui

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 14:36   ` Antonio Quartulli
@ 2012-07-02 14:47     ` Guido Iribarren
  2012-07-02 15:52     ` Marek Lindner
  2012-07-20 20:25     ` Guido Iribarren
  2 siblings, 0 replies; 24+ messages in thread
From: Guido Iribarren @ 2012-07-02 14:47 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hello Antonio!
thanks for your time,

On Mon, Jul 2, 2012 at 11:36 AM, Antonio Quartulli <ordex@autistici.org> wrote:
> Hello!
>
> Has debug support been compiled into batman-adv? If yes, it would be
> interesting to see the output of the tt log (batctl ll tt; batctl l).

unfortunately, no :(

root@colmena:~# batctl ll
Error - can't open file '/sys/class/net/bat0/mesh/log_level': No such
file or directory
The option you called seems not to be compiled into your batman-adv
kernel module.

Will compile that option on next firmware cooking :)
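
For reference, the same check can be scripted before calling batctl ll.
A minimal sketch, assuming the mesh interface is bat0 and the sysfs path
from the error message above; the function name is hypothetical.

```shell
# has_debug_log [SYSFS_ROOT] [MESH_IFACE]
# Succeeds if the log_level file exists, i.e. if the batman-adv module
# appears to have been built with debug log support.
# SYSFS_ROOT defaults to /sys and MESH_IFACE to bat0 (assumptions).
has_debug_log() {
    root="${1:-/sys}"
    iface="${2:-bat0}"
    [ -e "$root/class/net/$iface/mesh/log_level" ]
}
```

With this, something like "has_debug_log || echo 'no debug log support'"
can go at the top of any log-collecting script.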

> Recently we fixed a bug whose fix has not been released yet. If we are sure
> that this is the cause, you could possibly try an upgrade to a more recent
> dev-version. But let's see the log first (if possible).

Problem is, it's not easy to reproduce. I hadn't come across it for
several weeks. Nicolas Echaniz told me he suffered it recently, but I
don't think either of us can spend the time to try to recreate it on
purpose :(

Having debug support enabled and waiting for the bug to crop up is
probably the best we can hope for :)

Thanks!

Gui

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 14:36   ` Antonio Quartulli
  2012-07-02 14:47     ` Guido Iribarren
@ 2012-07-02 15:52     ` Marek Lindner
  2012-07-02 16:11       ` Guido Iribarren
  2012-07-20 20:25     ` Guido Iribarren
  2 siblings, 1 reply; 24+ messages in thread
From: Marek Lindner @ 2012-07-02 15:52 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Monday, July 02, 2012 16:36:04 Antonio Quartulli wrote:
> Recently we fixed a bug whose fix has not been released yet. If we are
> sure that this is the cause, you could possibly try an upgrade to a more
> recent dev-version. But let's see the log first (if possible).

You don't need the development version. I pushed these fixes into the latest 
batman-adv trunk package. If you update your package you should get them.

Cheers,
Marek

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 15:52     ` Marek Lindner
@ 2012-07-02 16:11       ` Guido Iribarren
  2012-07-02 16:26         ` Marek Lindner
  0 siblings, 1 reply; 24+ messages in thread
From: Guido Iribarren @ 2012-07-02 16:11 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi Marek!
Just to confirm and avoid useless compiling
PKG_VERSION:=2012.2.0
BATCTL_VERSION:=2012.2.0
PKG_MD5SUM:=68967ed1df709de18ab795722dde9341
BATCTL_MD5SUM:=7abd284098c514d3f2858e8a956c495e

~/trunk/feeds/packages/net/batman-adv$ svn info .
Path: .
URL: svn://svn.openwrt.org/openwrt/packages/net/batman-adv
Repository Root: svn://svn.openwrt.org/openwrt
Repository UUID: 3c298f89-4303-0410-b956-a3cf2f4a3e73
Revision: 32578
Node Kind: directory
Schedule: normal
Last Changed Author: marek
Last Changed Rev: 32578
Last Changed Date: 2012-07-02 12:51:27 -0300 (Mon, 02 Jul 2012)

Given the date and the author ;) I assume this rev should do the trick, right?

Thanks a lot!

Gui

On Mon, Jul 2, 2012 at 12:52 PM, Marek Lindner <lindner_marek@yahoo.de> wrote:
> On Monday, July 02, 2012 16:36:04 Antonio Quartulli wrote:
>> Recently we fixed a bug whose fix has not been released yet. If we are
>> sure that this is the cause, you could possibly try an upgrade to a more
>> recent dev-version. But let's see the log first (if possible).
>
> You don't need the development version. I pushed these fixes into the latest
> batman-adv trunk package. If you update your package you should get them.
>
> Cheers,
> Marek

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 16:11       ` Guido Iribarren
@ 2012-07-02 16:26         ` Marek Lindner
  0 siblings, 0 replies; 24+ messages in thread
From: Marek Lindner @ 2012-07-02 16:26 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Monday, July 02, 2012 18:11:24 Guido Iribarren wrote:
> Hi Marek!
> Just to confirm and avoid useless compiling
> PKG_VERSION:=2012.2.0
> BATCTL_VERSION:=2012.2.0
> PKG_MD5SUM:=68967ed1df709de18ab795722dde9341
> BATCTL_MD5SUM:=7abd284098c514d3f2858e8a956c495e
> 
> ~/trunk/feeds/packages/net/batman-adv$ svn info .
> Path: .
> URL: svn://svn.openwrt.org/openwrt/packages/net/batman-adv
> Repository Root: svn://svn.openwrt.org/openwrt
> Repository UUID: 3c298f89-4303-0410-b956-a3cf2f4a3e73
> Revision: 32578
> Node Kind: directory
> Schedule: normal
> Last Changed Author: marek
> Last Changed Rev: 32578
> Last Changed Date: 2012-07-02 12:51:27 -0300 (Mon, 02 Jul 2012)
> 
> Given the date and the author ;) I assume this rev should do the trick,
> right?

Yes, that looks about right. If you wish to update the package and not the
full image, you should update one more time, because Jow reminded me to
increase the package version.

Cheers,
Marek

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 13:30 [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping Guido Iribarren
  2012-07-02 13:57 ` Guido Iribarren
@ 2012-07-02 16:39 ` Gioacchino Mazzurco
  2012-07-02 16:42   ` Antonio Quartulli
  2012-07-21 18:54 ` Gioacchino Mazzurco
  2 siblings, 1 reply; 24+ messages in thread
From: Gioacchino Mazzurco @ 2012-07-02 16:39 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

That bug has happened in Pisa a few times.
I have discussed it with Antonio too.

I hope more test cases can help to understand what is happening!

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 16:39 ` Gioacchino Mazzurco
@ 2012-07-02 16:42   ` Antonio Quartulli
  2012-07-03  7:34     ` Nicolás Echániz
  0 siblings, 1 reply; 24+ messages in thread
From: Antonio Quartulli @ 2012-07-02 16:42 UTC (permalink / raw)
  To: Gioacchino Mazzurco
  Cc: The list for a Better Approach To Mobile Ad-hoc Networking

On Mon, Jul 02, 2012 at 06:39:49PM +0200, Gioacchino Mazzurco wrote:
> That bug has happened in Pisa a few times.
> I have discussed it with Antonio too.

yeah, it was pretty much the same!
I hope Guido can give us good results after testing the new patches :-)

You may want to give them a try too?? :):)

Cheers,

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 16:42   ` Antonio Quartulli
@ 2012-07-03  7:34     ` Nicolás Echániz
  2012-07-03  7:52       ` Wayne Abroue
  0 siblings, 1 reply; 24+ messages in thread
From: Nicolás Echániz @ 2012-07-03  7:34 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On 07/02/2012 01:42 PM, Antonio Quartulli wrote:
> On Mon, Jul 02, 2012 at 06:39:49PM +0200, Gioacchino Mazzurco wrote:
>> That bug has happened in Pisa a few times.
>> I have discussed it with Antonio too.
> 
> yeah, it was pretty much the same!
> I hope Guido can give us good results after testing the new patches :-)
> 
> You may want to give them a try too?? :):)

I just wanted to confirm that I've come across this bug quite often but
my setup is less tidy than guido's so it's more complex to debug.

I can add that quite recently we started in a nearby town a new WCN
project and we hit this bug the same day we put the first two nodes
online; they could bat-ping alright but no ping at all. All started
working after a reboot of one of the nodes.

Guido noted that this bug frequently shows up when we configure a
node at some point in the mesh (an admin's home, for instance) and then
move this node to its final location.
If I get the time to do so, I'll try to test whether this is really the
case or just a coincidence so far.


Cheers,
NicoEchániz

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-03  7:34     ` Nicolás Echániz
@ 2012-07-03  7:52       ` Wayne Abroue
  2012-07-03  8:07         ` Marek Lindner
  0 siblings, 1 reply; 24+ messages in thread
From: Wayne Abroue @ 2012-07-03  7:52 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Tue, Jul 3, 2012 at 9:34 AM, Nicolás Echániz
<nicoechaniz@codigosur.org> wrote:
> On 07/02/2012 01:42 PM, Antonio Quartulli wrote:
>> On Mon, Jul 02, 2012 at 06:39:49PM +0200, Gioacchino Mazzurco wrote:
>>> That bug has happened in Pisa a few times.
>>> I have discussed it with Antonio too.
>>
>> yeah, it was pretty much the same!
>> I hope Guido can give us good results after testing the new patches :-)
>>
>> You may want to give them a try too?? :):)
>
> I just wanted to confirm that I've come across this bug quite often but
> my setup is less tidy than guido's so it's more complex to debug.
>
> I can add that quite recently we started in a nearby town a new WCN
> project and we hit this bug the same day we put the first two nodes
> online; they could bat-ping alright but no ping at all. All started
> working after a reboot of one of the nodes.
>
> Guido noted that this bug frequently shows up when we configure a
> node at some point in the mesh (an admin's home, for instance) and then
> move this node to its final location.
> If I get the time to do so, I'll try to test whether this is really the
> case or just a coincidence so far.
>

Admittedly, using the older version in my 25-node mesh, I have also
wondered why nodes seemingly disappear without a trace when doing an
nmap scan. As L2 throughput still works, I haven't bothered to investigate.
On the upgrade note, is there a way to upgrade to 2012 without
reflashing the node?


Wayne A

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-03  7:52       ` Wayne Abroue
@ 2012-07-03  8:07         ` Marek Lindner
  2012-07-03  8:27           ` Wayne Abroue
  0 siblings, 1 reply; 24+ messages in thread
From: Marek Lindner @ 2012-07-03  8:07 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Tuesday, July 03, 2012 09:52:09 Wayne Abroue wrote:
> Admittedly, using the older version in my 25-node mesh, I have also
> wondered why nodes seemingly disappear without a trace when doing an
> nmap scan. As L2 throughput still works, I haven't bothered to investigate.
> On the upgrade note, is there a way to upgrade to 2012 without
> reflashing the node?

You can build a new package and install that. Note that you should build this 
package with the exact same build environment you currently have running.

Regards,
Marek

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-03  8:07         ` Marek Lindner
@ 2012-07-03  8:27           ` Wayne Abroue
  2012-07-03  8:37             ` Marek Lindner
  0 siblings, 1 reply; 24+ messages in thread
From: Wayne Abroue @ 2012-07-03  8:27 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Tue, Jul 3, 2012 at 10:07 AM, Marek Lindner <lindner_marek@yahoo.de> wrote:
> On Tuesday, July 03, 2012 09:52:09 Wayne Abroue wrote:
>> Admittedly, using the older version in my 25-node mesh, I have also
>> wondered why nodes seemingly disappear without a trace when doing an
>> nmap scan. As L2 throughput still works, I haven't bothered to investigate.
>> On the upgrade note, is there a way to upgrade to 2012 without
>> reflashing the node?
>
> You can build a new package and install that. Note that you should build this
> package with the exact same build environment you currently have running.
>

Thanks Marek. Unfortunately, all my nodes run one or another default
OpenWrt version, depending on ubnt/Mp/wrt driver compatibility.
Would it maybe be viable to add a package to older versions of the
OpenWrt repos, e.g. Batman-adv-new_stable?
That would make upgrading an easier exercise for us non-build-oriented
types.

Wayne

> Regards,
> Marek

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-03  8:27           ` Wayne Abroue
@ 2012-07-03  8:37             ` Marek Lindner
  0 siblings, 0 replies; 24+ messages in thread
From: Marek Lindner @ 2012-07-03  8:37 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Tuesday, July 03, 2012 10:27:55 Wayne Abroue wrote:
> Thanks Marek. Unfortunately, all my nodes run one or another default
> OpenWrt version, depending on ubnt/Mp/wrt driver compatibility.
> Would it maybe be viable to add a package to older versions of the
> OpenWrt repos, e.g. Batman-adv-new_stable?

I don't quite follow you. Your nodes run pre-compiled images from the OpenWrt
snapshot directory? If so, there isn't much we can do. As far as I know, the
OpenWrt team builds snapshots from time to time. Whenever they do, they also
upgrade the entire platform; these packages may or may not be backward
compatible.

Take this with a grain of salt. I don't really know how they are doing it. You 
should contact the OpenWrt developers because you use their binaries (unless I 
am totally on the wrong track).

Regards,
Marek

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 14:36   ` Antonio Quartulli
  2012-07-02 14:47     ` Guido Iribarren
  2012-07-02 15:52     ` Marek Lindner
@ 2012-07-20 20:25     ` Guido Iribarren
  2012-07-21 21:38       ` Antonio Quartulli
  2 siblings, 1 reply; 24+ messages in thread
From: Guido Iribarren @ 2012-07-20 20:25 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Resurrecting thread...

On Mon, Jul 2, 2012 at 11:36 AM, Antonio Quartulli <ordex@autistici.org> wrote:
> Hello!
>
> Has debug support been compiled into batman-adv? If yes, it would be
> interesting to see the output of the tt log (batctl ll tt; batctl l).
Ah, I should have re-read this before :(


> Recently we fixed a bug whose fix has not been released yet. If we are sure
> that this is the cause, you could possibly try an upgrade to a more recent
> dev-version. But let's see the log first (if possible).
> --
> Antonio Quartulli

Last week I came across this bug again, with the latest firmware, which
includes the fixes mentioned, pushed by Marek.
We were kinda in a hurry so I didn't have much time to check it
thoroughly, so there's a *slim* chance it was just a coincidence, such
as very poor signal giving erratic results.
But if I recall correctly, Nico Echaniz did stumble on this too, using
the latest firmware.
So, although i can't confirm it 100%, it seems so far the fixes didn't help :(

We'll keep an eye on it and try a "batctl l"

Cheers!

Gui

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-02 13:30 [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping Guido Iribarren
  2012-07-02 13:57 ` Guido Iribarren
  2012-07-02 16:39 ` Gioacchino Mazzurco
@ 2012-07-21 18:54 ` Gioacchino Mazzurco
  2012-07-21 21:40   ` Antonio Quartulli
  2 siblings, 1 reply; 24+ messages in thread
From: Gioacchino Mazzurco @ 2012-07-21 18:54 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Same bug today in ninux Pisa: after a node was turned off, the entire
network went crazy for 2 hours. To solve it I had to restart a lot of
nodes... :|

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-20 20:25     ` Guido Iribarren
@ 2012-07-21 21:38       ` Antonio Quartulli
  2012-07-22 10:57         ` Guido Iribarren
  0 siblings, 1 reply; 24+ messages in thread
From: Antonio Quartulli @ 2012-07-21 21:38 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hi Guido,

On Fri, Jul 20, 2012 at 05:25:46PM -0300, Guido Iribarren wrote:
> Resurrecting thread...
> 
> On Mon, Jul 2, 2012 at 11:36 AM, Antonio Quartulli <ordex@autistici.org> wrote:
> > Hello!
> >
> > Has debug support been compiled into batman-adv? If yes, it would be
> > interesting to see the output of the tt log (batctl ll tt; batctl l).
> Ah, I should have re-read this before :(
> 
> 
> > Recently we fixed a bug whose fix has not been released yet. If we are sure
> > that this is the cause, you could possibly try an upgrade to a more recent
> > dev-version. But let's see the log first (if possible).
> > --
> > Antonio Quartulli
> 
> Last week I came across this bug again, with the latest firmware, which
> includes the fixes mentioned, pushed by Marek.
> We were kinda in a hurry so I didn't have much time to check it
> thoroughly, so there's a *slim* chance it was just a coincidence, such
> as very poor signal giving erratic results.
> But if I recall correctly, Nico Echaniz did stumble on this too, using
> the latest firmware.

How did you solve it then? Rebooting?

> So, although i can't confirm it 100%, it seems so far the fixes didn't help :(
> 
> We'll keep an eye on it and try a "batctl l"

Yes, please. Remember to set the TT log level (batctl ll tt) before launching
batctl l.
Actually it would be very interesting to see the log of the involved nodes
during the "wrong behaviour period".

However, please keep an eye on the log anyway and report if you get any
message matching "*inconsistency*" (but report the whole part of the log, not
only this message). When you see those messages, please make sure that no client
is connecting at that time (if one is, it could be the normal procedure). If you
get this message, you should also see which node is involved in the
inconsistency (it is reported in the message too); please report the tt log from
that node too.
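
The procedure above could be scripted roughly like this. It is a sketch,
not a tested tool: the function names and the output file are
assumptions, and batctl l keeps following the log until it is
interrupted, so the capture has to be stopped by hand (Ctrl-C) once the
problem has shown up.

```shell
# capture_tt_log OUTFILE
# Enable the TT log level, then write the batman-adv log to OUTFILE.
# "batctl l" does not return by itself; stop it with Ctrl-C.
capture_tt_log() {
    batctl ll tt && batctl l > "$1"
}

# find_inconsistency LOGFILE
# Print every line mentioning "inconsistency" (case-insensitive),
# prefixed with its line number. Returns non-zero if nothing matches.
find_inconsistency() {
    grep -in 'inconsistency' "$1"
}
```

The grep is only there to flag that something matched; the whole log
file is what should be reported.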

Thank you very much!


-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara

* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-21 18:54 ` Gioacchino Mazzurco
@ 2012-07-21 21:40   ` Antonio Quartulli
  2012-07-22 10:54     ` Gioacchino Mazzurco
  0 siblings, 1 reply; 24+ messages in thread
From: Antonio Quartulli @ 2012-07-21 21:40 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hello Gioacchino,

On Sat, Jul 21, 2012 at 08:54:05PM +0200, Gioacchino Mazzurco wrote:
> Same bug today in ninux Pisa: after a node was turned off, the entire
> network went crazy for 2 hours. To solve it I had to restart a lot of
> nodes... :|
> 

Which version are you using? The latest openwrt package version (so with all
the new patches)?

Could you provide the log of the involved nodes whenever you get this problem?
I wrote something about the desired logs to Guido; you could follow the same
instructions. It would really be appreciated!

Thank you!


Cheers,

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara


* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-21 21:40   ` Antonio Quartulli
@ 2012-07-22 10:54     ` Gioacchino Mazzurco
  0 siblings, 0 replies; 24+ messages in thread
From: Gioacchino Mazzurco @ 2012-07-22 10:54 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

I'll compile batman-adv with debug support for the next firmware update; I
hope it will not affect performance too much...

On 07/21/12 23:40, Antonio Quartulli wrote:
> Hello Gioacchino,
> 
> On Sat, Jul 21, 2012 at 08:54:05PM +0200, Gioacchino Mazzurco wrote:
>> Same bug today in ninux pisa after a node was turned off the entire
>> network became crazy for 2 hours, to solve i had to restart a lot of
>> nodes... :|
>>
> 
> Which version are you using? The latest openwrt package version (so with all
> the new patches)?
> 
> Could you provide the log of the involved nodes whenever you get this problem?
> I wrote something about the desired logs to Guido; you could follow the same
> instructions. It would really be appreciated!
> 
> Thank you!
> 
> 
> Cheers,


* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-21 21:38       ` Antonio Quartulli
@ 2012-07-22 10:57         ` Guido Iribarren
  2012-07-22 11:20           ` Guido Iribarren
  0 siblings, 1 reply; 24+ messages in thread
From: Guido Iribarren @ 2012-07-22 10:57 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Sat, Jul 21, 2012 at 6:38 PM, Antonio Quartulli <ordex@autistici.org> wrote:
>> Last week I came across this bug again, with the latest firm which
>> includes the fixes mentioned, pushed by Marek.
>> We were kinda in a hurry so i didn't have much time to check it
>> thoroughly, so there's a *slim* chance it was just a coincidence, such
>> as very poor signal giving erratic results.
>> But if I recall correctly Nico Echaniz did stump on this too, using
>> the latest firm.
>
> How did you solve it then? Rebooting?

A reboot did, yes.

>
>> So, although i can't confirm it 100%, it seems so far the fixes didn't help :(
>>
>> We'll keep an eye on it and try a "batctl l"
>
> Yes, please. Remember to set the TT log level (batctl ll tt) before launching
> batctl l.
> Actually it would be very interesting to see the log of the involved nodes
> during the "wrong behaviour period".

This time it solved itself after some brief time (a minute) but the
symptoms were the same.
So I could catch some logs,
http://pastebin.com/MEENj94i

sadly, i wasn't fast enough to get a live log from the node involved
in the inconsistency as you suggested, so the report might be pretty
useless.
But at least now I got an idea where we are heading :)


> Thank you very much!

Thanks a lot for your support people!

Gui


* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-22 10:57         ` Guido Iribarren
@ 2012-07-22 11:20           ` Guido Iribarren
  2012-07-23 17:28             ` Antonio Quartulli
  0 siblings, 1 reply; 24+ messages in thread
From: Guido Iribarren @ 2012-07-22 11:20 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Sun, Jul 22, 2012 at 7:57 AM, Guido Iribarren
<guidoiribarren@buenosaireslibre.org> wrote:

>
> This time it solved itself after some brief time (a minute) but the
> symptoms were the same.
> So I could catch some logs,
> http://pastebin.com/MEENj94i
>
> sadly, i wasn't fast enough to get a live log from the node involved
> in the inconsistency as you suggested, so the report might be pretty
> useless.

On this particular node, the one I ran the previous report from (colmena-casa)
and which was rebooted recently, L3 ping to the whole network had the same
issue (no replies for a minute or so), so I had the chance to
"recreate" the situation several times.
It turns out that "batctl ll tt ; batctl l" on the nodes mentioned in the
inconsistencies gave no output at all, so the previous pastebin report
is in fact complete :P
It looks like the inconsistency is being resolved locally between
neighbours, without the need to contact the far end of the network
(which is consistent with what's described in the wiki).

In any case, AFAIR previous occurrences of the bug didn't resolve by
themselves (in a reasonable amount of time), so what I'm looking at now
might be perfectly normal behaviour? (TT tables take some time to
propagate?)


* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-22 11:20           ` Guido Iribarren
@ 2012-07-23 17:28             ` Antonio Quartulli
  2012-08-05  5:34               ` Gui Iribarren
  0 siblings, 1 reply; 24+ messages in thread
From: Antonio Quartulli @ 2012-07-23 17:28 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Sun, Jul 22, 2012 at 08:20:21AM -0300, Guido Iribarren wrote:
> On Sun, Jul 22, 2012 at 7:57 AM, Guido Iribarren
> <guidoiribarren@buenosaireslibre.org> wrote:
> 
> >
> > This time it solved itself after some brief time (a minute) but the
> > symptoms were the same.
> > So I could catch some logs,
> > http://pastebin.com/MEENj94i
> >
> > sadly, i wasn't fast enough to get a live log from the node involved
> > in the inconsistency as you suggested, so the report might be pretty
> > useless.
> 
> from this particular node i ran previous report (colmena-casa) that
> was rebooted recently, L3 ping to all of the network had the same
> issue, (no replies for a minute or so) so i had the chance to
> "recreate" the situation several times.
> Turns out, a "batctl ll tt ; batctl l" on the nodes mentioned in the
> inconsistencies gave no output at all, so the previous pastebin report
> is in fact complete :P
> Looks like the inconsistency is being resolved locally between
> neighbours, without the need to contact the far end of the network
> (which is coherent with what's described in the wiki)

Exactly! If the neighbour has the needed information, the node can directly get
answered without bothering the real destination ;)

> 
> In any case, AFAIR previous occurrences of the bug didn't resolve by
> themselves (in a reasonable amount of time) so what I'm looking at now
> might be perfectly normal behaviour? (tt tables take some time to
> propagate?)

Well, the log you posted is perfectly correct. You missed some OGMs, therefore
the node is asking for the update that it missed.

It would be interesting to run batctl ll tt; batctl l all the time on the node
that usually experiences the "problem". The log should not be very big, unless
the bug happens.
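
One possible way to keep such a log running unattended is sketched below. It
assumes a Python interpreter is available on the node (which may not be true
of a small OpenWrt router); the function name, the file path and the size
limit are illustrative assumptions. The command itself is passed in by the
caller, so nothing here depends on batctl beyond the two commands already
mentioned above.

```python
import os
import subprocess
import time


def capture(cmd, path, max_bytes=1_000_000):
    """Run `cmd` and append each output line, timestamped, to `path`.

    When `path` has reached `max_bytes` it is moved to `path + ".1"`
    (overwriting any older copy) so the log cannot grow without bound.
    Returns the exit status of `cmd`.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    for line in proc.stdout:
        if os.path.exists(path) and os.path.getsize(path) >= max_bytes:
            os.replace(path, path + ".1")  # keep one previous generation
        # Reopening per line is slow but simple; the log is expected to be
        # small unless the bug happens.
        with open(path, "a") as out:
            out.write(time.strftime("%Y-%m-%d %H:%M:%S ") + line)
    return proc.wait()
```

After setting the log level with "batctl ll tt", this could be called as
capture(["batctl", "l"], "/tmp/batman-tt.log"), where the path is only a
placeholder.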

Cheers,


-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara


* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-07-23 17:28             ` Antonio Quartulli
@ 2012-08-05  5:34               ` Gui Iribarren
  2012-08-05  7:58                 ` Antonio Quartulli
  0 siblings, 1 reply; 24+ messages in thread
From: Gui Iribarren @ 2012-08-05  5:34 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

On Mon, Jul 23, 2012 at 2:28 PM, Antonio Quartulli <ordex@autistici.org> wrote:
> On Sun, Jul 22, 2012 at 08:20:21AM -0300, Guido Iribarren wrote:
>> On Sun, Jul 22, 2012 at 7:57 AM, Guido Iribarren
>> <guidoiribarren@buenosaireslibre.org> wrote:
>>
>> >
>> > This time it solved itself after some brief time (a minute) but the
>> > symptoms were the same.
>> > So I could catch some logs,
>> > http://pastebin.com/MEENj94i
>> >
>> > sadly, i wasn't fast enough to get a live log from the node involved
>> > in the inconsistency as you suggested, so the report might be pretty
>> > useless.
>>
>> from this particular node i ran previous report (colmena-casa) that
>> was rebooted recently, L3 ping to all of the network had the same
>> issue, (no replies for a minute or so) so i had the chance to
>> "recreate" the situation several times.
>> Turns out, a "batctl ll tt ; batctl l" on the nodes mentioned in the
>> inconsistencies gave no output at all, so the previous pastebin report
>> is in fact complete :P
>> Looks like the inconsistency is being resolved locally between
>> neighbours, without the need to contact the far end of the network
>> (which is coherent with what's described in the wiki)
>
> Exactly! If the neighbour has the needed information, the node can directly get
> answered without bothering the real destination ;)
>
>>
>> In any case, AFAIR previous occurrences of the bug didn't resolve by
>> themselves (in a reasonable amount of time) so what I'm looking at now
>> might be perfectly normal behaviour? (tt tables take some time to
>> propagate?)
>
> Well, the log you posted is perfectly correct. You missed some OGMs, therefore
> the node is asking for an update that he missed.
>
> it would be interesting to run batctl ll tt; batctl l all the time on the node
> that usually experiences the "problem". The log should be not so big, unless the
> bug happens.

I admit I haven't left this running as instructed, but on the other
hand, so far I haven't come across the original bug again, and a few
days ago I asked Nico Echaniz, who confirmed that he's not suffering
from it as he did previously.
From time to time he does run into [a few moments | a few minutes] of
"nodes majaretas" (at first sight), but it resolves by itself
quickly[*], which indicates normal behaviour: missing OGMs and
consequently a delay in TT table updating, as you explained.

[*] "quickly" means under 15 minutes at most. Previously, the problem
would never resolve by itself; the node stayed L3-unreachable for hours
or days until a manual reboot was done.
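
To put numbers on "quickly", the recovery time could be measured from periodic
probes. The sketch below assumes that something else (not shown) records, at
each probe time, whether the node answered a layer-2 "batctl ping" and whether
it answered a normal layer-3 ping. The sample format and the function name are
hypothetical.

```python
from typing import Iterable, List, Tuple

# (unix time, layer-2 reply seen, layer-3 reply seen); an assumed format
Sample = Tuple[float, bool, bool]


def l3_only_outages(samples: Iterable[Sample]) -> List[Tuple[float, float]]:
    """Return (start, end) intervals where L2 answered but L3 did not.

    `end` is the time of the first probe that no longer shows the symptom,
    or the time of the last probe if the symptom is still present.
    """
    outages = []
    start = None
    last = None
    for ts, l2_ok, l3_ok in samples:
        broken = l2_ok and not l3_ok  # reachable by batctl ping only
        if broken and start is None:
            start = ts
        elif not broken and start is not None:
            outages.append((start, ts))
            start = None
        last = ts
    if start is not None:
        outages.append((start, last))
    return outages
```

Intervals longer than 15 * 60 seconds would then be the ones that do not fit
the "resolves by itself quickly" description.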

In conclusion, so far so good; I think we can close this as fixed for
lack of evidence to the contrary, heh.
I hope Gioacchino managed to recompile the ninux images and is enjoying the
same stability as we are :)

Gui


* Re: [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping
  2012-08-05  5:34               ` Gui Iribarren
@ 2012-08-05  7:58                 ` Antonio Quartulli
  0 siblings, 0 replies; 24+ messages in thread
From: Antonio Quartulli @ 2012-08-05  7:58 UTC (permalink / raw)
  To: The list for a Better Approach To Mobile Ad-hoc Networking

Hello Guido, and thank you for reporting back your results :) However, even if
the "behaviour" is good (the table gets recovered and everything starts working
again), it is a bit strange that it takes 15 minutes to do so.

If you happen to see the bug again, it would be interesting to get the log of
the "non-working" node and see why it is taking so long.

Thank you very much!

Cheers,

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto "Che" Guevara


end of thread, other threads:[~2012-08-05  7:58 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-02 13:30 [B.A.T.M.A.N.] batman majareta? I can batctl ping but not ping Guido Iribarren
2012-07-02 13:57 ` Guido Iribarren
2012-07-02 14:36   ` Antonio Quartulli
2012-07-02 14:47     ` Guido Iribarren
2012-07-02 15:52     ` Marek Lindner
2012-07-02 16:11       ` Guido Iribarren
2012-07-02 16:26         ` Marek Lindner
2012-07-20 20:25     ` Guido Iribarren
2012-07-21 21:38       ` Antonio Quartulli
2012-07-22 10:57         ` Guido Iribarren
2012-07-22 11:20           ` Guido Iribarren
2012-07-23 17:28             ` Antonio Quartulli
2012-08-05  5:34               ` Gui Iribarren
2012-08-05  7:58                 ` Antonio Quartulli
2012-07-02 16:39 ` Gioacchino Mazzurco
2012-07-02 16:42   ` Antonio Quartulli
2012-07-03  7:34     ` Nicolás Echániz
2012-07-03  7:52       ` Wayne Abroue
2012-07-03  8:07         ` Marek Lindner
2012-07-03  8:27           ` Wayne Abroue
2012-07-03  8:37             ` Marek Lindner
2012-07-21 18:54 ` Gioacchino Mazzurco
2012-07-21 21:40   ` Antonio Quartulli
2012-07-22 10:54     ` Gioacchino Mazzurco
