b.a.t.m.a.n.lists.open-mesh.org archive mirror
* About Throughput in BATMAN_V
@ 2024-04-03 12:56 berkay.demirci
  2024-04-03 13:16 ` Marek Lindner
  0 siblings, 1 reply; 10+ messages in thread
From: berkay.demirci @ 2024-04-03 12:56 UTC (permalink / raw)
  To: b.a.t.m.a.n

Hello,

I have a question about how Batman V handles throughput. Batman doesn't seem to calculate throughput properly, so we set it manually with throughput override. But then, even when the actual throughput of the active interface decreases, it doesn't switch to the other interface because it only considers the overridden value.

It only switches when the active interface stops receiving OGMs completely. I think this wouldn't be a problem if throughput were calculated properly, so I want to ask why it works this way. Batman already has a tool called throughput meter; shouldn't it be used to continuously check the value?


* Re: About Throughput in BATMAN_V
  2024-04-03 12:56 About Throughput in BATMAN_V berkay.demirci
@ 2024-04-03 13:16 ` Marek Lindner
  2024-04-04  7:00   ` berkay.demirci
  0 siblings, 1 reply; 10+ messages in thread
From: Marek Lindner @ 2024-04-03 13:16 UTC (permalink / raw)
  To: b.a.t.m.a.n, berkay.demirci

Hi,

> Batman doesn't seem to calculate throughput properly

Please provide details about what you mean. What is the expected vs
calculated throughput, and how have you determined that the calculation is wrong?


> so we set it manually with throughput override, but then even when actual
> throughput of the active interface decreases, it doesn't switch to the other
> interface because it only considers the overridden value.

Please describe the steps how this problem can be reproduced.


> Batman already has a tool called throughput meter, shouldn't it be used to
> continuously check the value?

A number of patches were proposed to integrate the throughput meter as a
fallback when the throughput cannot be determined via other means:

https://lists.open-mesh.org/mailman3/hyperkitty/list/b.a.t.m.a.n@lists.open-mesh.org/thread/AJYTJOONPCJ2GSPSTOFTTO5TZQKYJGOZ/

There were a few open issues that require further work. If you are interested
in spending time on this subject, I am happy to provide assistance.

Cheers,
Marek





* Re: About Throughput in BATMAN_V
  2024-04-03 13:16 ` Marek Lindner
@ 2024-04-04  7:00   ` berkay.demirci
  2024-04-04 10:00     ` Marek Lindner
  0 siblings, 1 reply; 10+ messages in thread
From: berkay.demirci @ 2024-04-04  7:00 UTC (permalink / raw)
  To: b.a.t.m.a.n

We have two modems on each node. The expected throughput is about 6 Mb/s for one of them and about 30 Mb/s for the other, and measurements with iperf and the throughput meter confirm this. But after adding them to batman with batctl if add and running batctl o, I see that the throughput value for both interfaces is 10000 instead.

I looked at the interfaces with ethtool, and the speed there is 10000 Mb/s for both as well, which must be how batman determines the throughput. This isn't good because it doesn't reflect the actual speed. If we use throughput override, it's fine at first, but one of the modems has a shorter range. In our test, where two nodes move away from each other, the actual throughput decreases due to losses, but batman still chooses the same interface due to the overridden value.
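
The speed that ethtool prints can usually also be read from the Linux sysfs attribute /sys/class/net/<iface>/speed (in Mb/s). Below is a minimal sketch for checking what an interface advertises; the function names are hypothetical, and treating non-positive or unreadable values as "unknown" is an assumption made for this example, not something batman-adv is confirmed to do.

```python
from pathlib import Path
from typing import Optional


def parse_link_speed(raw: str) -> Optional[int]:
    """Return the link speed in Mb/s, or None if it is unknown."""
    try:
        speed = int(raw.strip())
    except ValueError:
        return None
    # Drivers commonly report -1 when the speed is unknown.
    return speed if speed > 0 else None


def read_link_speed(iface: str, sysfs_root: str = "/sys/class/net") -> Optional[int]:
    """Read the advertised link speed of `iface` from sysfs."""
    path = Path(sysfs_root) / iface / "speed"
    try:
        return parse_link_speed(path.read_text())
    except OSError:
        # The attribute may be missing or unreadable (e.g. interface down).
        return None
```

A value of 10000 here only says what the port negotiated with the modem; it says nothing about the radio link behind it.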

Basically I would prefer batman being able to change measured throughput dynamically.


* Re: About Throughput in BATMAN_V
  2024-04-04  7:00   ` berkay.demirci
@ 2024-04-04 10:00     ` Marek Lindner
  2024-04-05  8:06       ` berkay.demirci
  0 siblings, 1 reply; 10+ messages in thread
From: Marek Lindner @ 2024-04-04 10:00 UTC (permalink / raw)
  To: b.a.t.m.a.n

Hi,

> We have two modems for each node and in one of them, expected throughput
> should be about 6 Mb/s for example, and in the other one it should be about
> 30 Mb/s. By using iperf and also throughput meter I can see that it's the
> case. But when they are added to batman with batctl if add, after typing
> batctl o, I see that the throughput values in both interfaces are 10000
> instead.
> 
> I looked at the interfaces with ethtool and the speed is 10000 Mb/s there
> for both too which is how batman must be measuring the throughput

Correct. If the underlying interface provides a link speed via ethtool, batman
uses the ethtool API to get the throughput value.


> If we use throughput override, it's fine at first but one of the modems has a
> shorter range so in our test where two nodes move away from each other,
> actual throughput gets decreased due to losses but batman still chooses the
> same interface due to the overridden value.

That is what the manual override is meant to do: it is a manual value that
overrides all dynamically determined values.

Can you explain what type of "modem" you are talking about? It is not clear
why a modem depends on range. Or are you talking about a batman mesh
connecting various modems? Please share the topology of your setup.

Is this somehow related to your earlier statement: "[..] but then even when 
actual throughput of the active interface decreases, it doesn't switch to the 
other interface because it only considers the overridden value." ?


> Basically I would prefer batman being able to change measured throughput
> dynamically.

If I understand correctly, you are changing from "Batman doesn't seem to
calculate throughput properly" to "measured throughput is preferable"? So there
is no calculation issue with batman v?

Cheers,
Marek





* Re: About Throughput in BATMAN_V
  2024-04-04 10:00     ` Marek Lindner
@ 2024-04-05  8:06       ` berkay.demirci
  2024-04-08  8:28         ` Marek Lindner
  0 siblings, 1 reply; 10+ messages in thread
From: berkay.demirci @ 2024-04-05  8:06 UTC (permalink / raw)
  To: b.a.t.m.a.n

I mean batman is getting the right value from ethtool, but the problem is that the ethtool value is not a good metric, as throughput can drop when two nodes move away from each other. After checking the batman code, I have a better understanding. Batman's throughput calculation for WiFi interfaces is probably what we want because it uses cfg80211's expected_throughput. But we are connecting custom modems to ethernet interfaces, so they aren't wlan interfaces, and batman uses the speed value from ethtool, which isn't always accurate.
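
To make the order of these sources explicit, here is a small model of the selection as it is described in this thread: a manual override wins, then the expected throughput from the WiFi driver, then the ethtool link speed. This is only a sketch and not the kernel code; the function name, the kbit/s units, and the 1 Mbit/s fallback constant are assumptions made for illustration.

```python
from typing import Optional

# Assumed fallback used only in this sketch when nothing else is known.
DEFAULT_THROUGHPUT_KBPS = 1000


def select_link_throughput(
    override_kbps: int,
    wifi_expected_kbps: Optional[int],
    ethtool_speed_mbps: Optional[int],
) -> int:
    """Pick the throughput value (kbit/s) a link would be advertised with."""
    # 1. A non-zero manual override replaces every dynamic value.
    if override_kbps > 0:
        return override_kbps
    # 2. Wireless interfaces: estimate reported by the WiFi driver.
    if wifi_expected_kbps is not None:
        return wifi_expected_kbps
    # 3. Other interfaces: link speed reported through ethtool.
    if ethtool_speed_mbps is not None and ethtool_speed_mbps > 0:
        return ethtool_speed_mbps * 1000
    # 4. Nothing usable: fall back to the assumed default.
    return DEFAULT_THROUGHPUT_KBPS
```

With our modems behind ethernet ports, only steps 1 and 3 are ever reached, and neither of them reacts to packet loss.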

We also did tests in a virtual environment. According to this commit https://git.open-mesh.org/batman-adv.git/commit/6e860b3d5e4147bafcda32bf9b3e769926f232c5, ethtool link speed detection used to be disabled for such cases, but that was reverted since automatic measurements aren't implemented. So, is the throughput_meter fallback method that is being worked on right now supposed to be the automatic measurement for cases like this? Whatever the method is, dynamically calculating throughput is a must because, like I said, one of our modems has a shorter range. It is faster when two nodes are close, but as the nodes move away there is a lot of packet loss, so the real throughput drops as well, while the overridden throughput value stays the same.

BATMAN_IV doesn't have that problem because it considers packet loss, but it is worse because it doesn't take throughput into account, so we can't use that either. If there is any way to take packet loss into account on BATMAN_V that I'm not aware of, I would like to learn about it, but I'm guessing probably not.

I see that the last patch for tp fallback was written in 2018, has there been no more progress since then? And what are the problems with it?


* Re: About Throughput in BATMAN_V
  2024-04-05  8:06       ` berkay.demirci
@ 2024-04-08  8:28         ` Marek Lindner
  2024-04-15  8:20           ` Berkay Demirci
  0 siblings, 1 reply; 10+ messages in thread
From: Marek Lindner @ 2024-04-08  8:28 UTC (permalink / raw)
  To: b.a.t.m.a.n

On Friday, 5 April 2024 10:06:56 CEST berkay.demirci@protonmail.com wrote:
> We also did tests in virtual environment and according to this commit
> https://git.open-mesh.org/batman-adv.git/commit/6e860b3d5e4147bafcda32bf9b3e769926f232c5,
> ethtool link speed detection used to be disabled for such
> cases but got reverted since automatic measurements aren't implemented.

No. Batman-adv used the ethtool 'auto-negotiation' on/off state to decide
whether the ethtool throughput value should be trusted.

As the commit states, the 'auto-negotiation' state has no impact on whether 
the reported throughput value should be trusted. Auto-negotiation could be on 
or off and still the value is wrong.


> dynamically calculating throughput is a must because like I said, one of our
> modems has a shorter distance range so it is faster when two nodes are
> close but as nodes move away, there is lots of packet loss so the real
> throughput drops as well, but the overridden throughput value stays the
> same.

You keep mentioning "modems have distance", "move away", "ethtool", etc.
without having explained what your setup is. Without details about
these "modems with range" and what interface types you are talking about,
nobody can really comment on your setup.


> I see that the last patch for tp fallback was written in 2018, has there
> been no more progress since then? And what are the problems with it?

The main obstacle is time & energy to work on the tp fallback integration. 
Open issues were mentioned in the responses to the various patches.

If you are interested in spending time on these patches, I am happy to provide
assistance.

Cheers,
Marek





* Re: About Throughput in BATMAN_V
  2024-04-08  8:28         ` Marek Lindner
@ 2024-04-15  8:20           ` Berkay Demirci
  2024-04-15  9:13             ` Marek Lindner
  2024-05-19 14:33             ` Marek Lindner
  0 siblings, 2 replies; 10+ messages in thread
From: Berkay Demirci @ 2024-04-15  8:20 UTC (permalink / raw)
  To: b.a.t.m.a.n

[-- Attachment #1: Type: text/plain, Size: 2357 bytes --]

I attached a file with graphs that show what is happening. Worksheets from THR5 up to THRS9 and LOWMGEN are the relevant ones, and the only difference between them is the ELP and OGM period. But changing those didn't make much of a difference in the end, so we can just look at THR5.

We normally have 4 routers, but for the test we are simulating we used network 1, whose throughput is 30000 kbps, and network 3, whose throughput is 8000 kbps, and we give those values to batman with throughput_override. The graph shows the number of OGM and ELP packets sent by node-1 on networks 1 and 3 that were able to reach node-2, as well as the PDR (packet delivery ratio), which starts dropping heavily due to packet loss and then picks back up when switching to network 3. The node-2 connection NW graph shows which network is chosen by batman: network 1 at first, then 3, etc.

In the test scenario, two nodes move away from each other, so packet loss increases over time, but it increases more in network 1. Since the throughput values are overridden, batman still chooses that network based on that value. Only when OGM messages stop arriving on network 1 does batman switch to network 3, and we see the PDR increase to 1 immediately when that happens.
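
To illustrate with numbers why we would like an earlier switch, here is a small sketch that compares the static override values with a loss-aware "effective" throughput (nominal throughput multiplied by PDR). The effective throughput metric and the PDR values in the usage note are hypothetical examples, not measurements from the attached file and not something batman-adv computes.

```python
from typing import Dict


def effective_throughput_kbps(nominal_kbps: int, pdr: float) -> float:
    """Scale the nominal throughput by the packet delivery ratio (0..1)."""
    if not 0.0 <= pdr <= 1.0:
        raise ValueError("pdr must be between 0 and 1")
    return nominal_kbps * pdr


def best_static(nominal: Dict[str, int]) -> str:
    """Network chosen when only the fixed override values are compared."""
    return max(nominal, key=lambda net: nominal[net])


def best_loss_aware(nominal: Dict[str, int], pdr: Dict[str, float]) -> str:
    """Network chosen when packet loss is folded into the metric."""
    return max(nominal, key=lambda net: effective_throughput_kbps(nominal[net], pdr[net]))
```

For example, with nominal = {"network1": 30000, "network3": 8000} and a hypothetical PDR of 0.2 on network 1 and 1.0 on network 3, best_static still picks network1, while best_loss_aware picks network3 because 30000 * 0.2 = 6000 kbps is less than 8000 kbps.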

Basically we want batman to be able to switch earlier than that, and that's why I asked about the throughput meter implementation: the overridden throughput value doesn't consider packet loss. Another idea we had was to change the throughput value manually via a script if packet loss increases too much, or something like that; we haven't thought about it in detail yet. So I'm asking whether you have any suggestion that considers packet loss as well.
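
A rough sketch of the script idea, to show what we have in mind: scale the nominal value by a measured PDR and only push a new override when it changes noticeably. Everything here is an assumption for illustration, including the helper names, the 10% update threshold, the 100 kbps floor (kept because an override of 0 is assumed to switch the override off), and the batctl command line, which we assume to be "batctl hardif <iface> throughput_override <value>kbit" and which may differ between batctl versions. How the PDR is measured is left open.

```python
import subprocess
from typing import List, Optional


def scaled_override_kbps(nominal_kbps: int, pdr: float, floor_kbps: int = 100) -> int:
    """Nominal throughput scaled by PDR, clamped to a non-zero floor."""
    pdr = min(max(pdr, 0.0), 1.0)
    return max(int(nominal_kbps * pdr), floor_kbps)


def needs_update(current_kbps: Optional[int], new_kbps: int, min_change: float = 0.1) -> bool:
    """Only update when the value moved by more than `min_change` (relative)."""
    if current_kbps is None:
        return True
    return abs(new_kbps - current_kbps) > min_change * current_kbps


def batctl_override_command(iface: str, kbps: int) -> List[str]:
    # Assumed batctl syntax; check `man batctl` for the installed version.
    return ["batctl", "hardif", iface, "throughput_override", f"{kbps}kbit"]


def apply_override(iface: str, kbps: int) -> None:
    """Run batctl to set the override (requires root and batctl installed)."""
    subprocess.run(batctl_override_command(iface, kbps), check=True)
```

The threshold is there to avoid flapping between the two networks when the PDR fluctuates slightly around the switching point.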

Otherwise, I'd also appreciate the assistance you could provide with the patches for the tp fallback implementation. Does it work at all in its current state, even with problems, or is it not there yet?

The setup is that we have custom routers connected to the ethernet ports of Ubuntu computers, and in the simulation they are set up virtually. I don't think the details matter because without tp override ethtool is used, and it just gives the maximum physical layer speed, which Sven Eckelmann also mentions in the last reply in https://lists.open-mesh.org/mailman3/hyperkitty/list/b.a.t.m.a.n@lists.open-mesh.org/thread/AJYTJOONPCJ2GSPSTOFTTO5TZQKYJGOZ/

[-- Attachment #2: LINKRECORD_STATUS (1).xlsx --]
[-- Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, Size: 734423 bytes --]


* Re: About Throughput in BATMAN_V
  2024-04-15  8:20           ` Berkay Demirci
@ 2024-04-15  9:13             ` Marek Lindner
  2024-04-15 18:27               ` Berkay Demirci
  2024-05-19 14:33             ` Marek Lindner
  1 sibling, 1 reply; 10+ messages in thread
From: Marek Lindner @ 2024-04-15  9:13 UTC (permalink / raw)
  To: b.a.t.m.a.n

On Monday, 15 April 2024 10:20:20 CEST Berkay Demirci wrote:
> I attached a file that has graphs that show what is happening. Worksheets
> from THR5 up to THRS9 and LOWMGEN are the ones relevant and the difference
> between them is the period of ELP and OGM.

Why have you decided to configure the ELP interval to 1s and OGM interval to 
0.5s? 

Cheers,
Marek






* Re: About Throughput in BATMAN_V
  2024-04-15  9:13             ` Marek Lindner
@ 2024-04-15 18:27               ` Berkay Demirci
  0 siblings, 0 replies; 10+ messages in thread
From: Berkay Demirci @ 2024-04-15 18:27 UTC (permalink / raw)
  To: b.a.t.m.a.n

Just to try different combinations and see whether it made any difference, for example whether it could switch earlier. But either way, it only switches to network 3 when OGMs stop reaching node 2 on network 1.


* Re: About Throughput in BATMAN_V
  2024-04-15  8:20           ` Berkay Demirci
  2024-04-15  9:13             ` Marek Lindner
@ 2024-05-19 14:33             ` Marek Lindner
  1 sibling, 0 replies; 10+ messages in thread
From: Marek Lindner @ 2024-05-19 14:33 UTC (permalink / raw)
  To: b.a.t.m.a.n; +Cc: Berkay Demirci

On Monday, 15 April 2024 10:20:20 CEST Berkay Demirci wrote:
> In the test scenario, two nodes move away from each other so packet loss
> increases over time but it increases more in network 1, and since the
> throughput values are overridden, batman still chooses that network based on
> that value. Only when OGM messages stop reaching in network 1, batman
> switches to network 3 and we see the PDR increasing to 1 immediately when
> that happens.

Correct, the throughput override is a static value and does not adjust to a
changing environment. On a wireless interface, the estimated throughput would be
adjusted as the nodes move away from each other (batman-adv is able to read
estimated throughput values from the WiFi driver).


> Basically we want batman to be able to switch earlier than that and that's
> why I asked about the throughput meter implementation because the batman
> overridden throughput value doesn't consider packet losses.

Exactly, the throughput override does not consider anything other than the 
configured value.


> Another idea we had was to manually change the throughput value via a script
> if packet loss increases too much or something like that, we haven't thought
> in detail yet. So I'm asking if you could have any suggestion that considers
> packet loss as well.

It seems you are attempting to simulate a wireless environment using wired
devices. Wired devices typically cannot "move away" from each other, hence
you are running into this issue with your simulation approach.

Maybe mac80211 hwsim is an option for you? I've never used it, so can't 
provide specific suggestions.

Have you considered testing on a wireless testbed?


> Otherwise, I'd also appreciate the assistance you could provide for the
> patches for tp fallback implementation. Does it work at all at its current
> state even with problems or is it not there yet?

Your question isn't entirely clear to me. As a first step, you'd have to rebase 
the tp meter patches to work on your chosen batman-adv version. How much work 
that might be is hard to ascertain without trying it.

Cheers,
Marek





end of thread, other threads:[~2024-05-19 14:34 UTC | newest]
