All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lars Everbrand <lars.everbrand@protonmail.com>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: Jakub Kicinski <kuba@kernel.org>,
	linux-kernel@vger.kernel.org,
	Veaceslav Falico <vfalico@gmail.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	"David S. Miller" <davem@davemloft.net>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net-next] bonding: correct rr balancing during link failure
Date: Tue, 15 Dec 2020 21:54:54 +0000	[thread overview]
Message-ID: <X9kwqvgoAmrjAaXY@black-debian> (raw)
In-Reply-To: <15308.1607463969@famine>

On Tue, Dec 08, 2020 at 01:46:09PM -0800, Jay Vosburgh wrote:
> 
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> >On Wed, 02 Dec 2020 20:55:57 +0000 Lars Everbrand wrote:
> 	Are these bandwidth numbers from observation of the actual
> behavior?  I'm not sure the real system would behave this way; my
> suspicion is that it would increase the likelihood of drops on the
> overused slave, not that the overall capacity would be limited.
I tested this with with 2 VMs and 5 bridges with bandwidth limitiation
via 'virsh domiftune' to bring the speed down to something similar to 
100Mbit/s.

iperf results:

with patch:
---
working       iperf
interfaces    speed [mbit/s]
5             442
4             363
3             278
2             199
1             107

without patch:
---
working       iperf
interfaces    speed [mbit/s]
5             444
4             226
3             155
2             129
1             107

The speed at 5x100 is not going as high as I expected but the
sub-optimal speed is still visible.

Note that the degradation tested is with downing interfaces sequentially
which is the worst-case for this problem.

> >Looking at the code in question it feels a little like we're breaking
> >abstractions if we bump the counter directly in get_slave_by_id.
> 
> 	Agreed; I think a better way to fix this is to enable the slave
> array for balance-rr mode, and then use the array to find the right
> slave.  This way, we then avoid the problematic "skip unable to tx"
> logic for free.
> 
> >For one thing when the function is called for IGMP packets the counter
> >should not be incremented at all. But also if packets_per_slave is not
> >1 we'd still be hitting the same leg multiple times (packets_per_slave
> >/ 2). So it seems like we should round the counter up somehow?
> >
> >For IGMP maybe we don't have to call bond_get_slave_by_id() at all,
> >IMHO, just find first leg that can TX. Then we can restructure
> >bond_get_slave_by_id() appropriately for the non-IGMP case.
> 
> 	For IGMP, the theory is to confine that traffic to a single
> device.  Normally, this will be curr_active_slave, which is updated even
> in balance-rr mode as interfaces are added to or removed from the bond.
> The call to bond_get_slave_by_id should be a fallback in case
> curr_active_slave is empty, and should be the exception, and may not be
> possible at all.
> 
> 	But either way, the IGMP path shouldn't mess with rr_tx_counter,
> it should be out of band of the normal TX packet counting, so to speak.
> 
> 	-J
> 
> >> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >> index e0880a3840d7..e02d9c6d40ee 100644
> >> --- a/drivers/net/bonding/bond_main.c
> >> +++ b/drivers/net/bonding/bond_main.c
> >> @@ -4107,6 +4107,7 @@ static struct slave *bond_get_slave_by_id(struct bonding *bond,
> >>  		if (--i < 0) {
> >>  			if (bond_slave_can_tx(slave))
> >>  				return slave;
> >> +			bond->rr_tx_counter++;
> >>  		}
> >>  	}
> >>
> >> @@ -4117,6 +4118,7 @@ static struct slave *bond_get_slave_by_id(struct bonding *bond,
> >>  			break;
> >>  		if (bond_slave_can_tx(slave))
> >>  			return slave;
> >> +		bond->rr_tx_counter++;
> >>  	}
> >>  	/* no slave that can tx has been found */
> >>  	return NULL;
> >
> 
> ---
> 	-Jay Vosburgh, jay.vosburgh@canonical.com


  reply	other threads:[~2020-12-15 21:58 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-02 20:55 [PATCH net-next] bonding: correct rr balancing during link failure Lars Everbrand
2020-12-05 19:45 ` Jakub Kicinski
2020-12-08 21:46   ` Jay Vosburgh
2020-12-15 21:54     ` Lars Everbrand [this message]
2020-12-15 21:32   ` Lars Everbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=X9kwqvgoAmrjAaXY@black-debian \
    --to=lars.everbrand@protonmail.com \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=jay.vosburgh@canonical.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=vfalico@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.