netdev.vger.kernel.org archive mirror
From: Jay Vosburgh <jay.vosburgh@canonical.com>
To: Mahesh Bandewar (महेश बंडेवार) <maheshb@google.com>
Cc: Andy Gospodarek <andy@greyhouse.net>,
	Veaceslav Falico <vfalico@gmail.com>,
	David Miller <davem@davemloft.net>,
	Netdev <netdev@vger.kernel.org>,
	Mahesh Bandewar <mahesh@bandewar.net>
Subject: Re: [PATCH net] bonding: fix active-backup transition after link failure
Date: Wed, 11 Dec 2019 22:38:06 -0800
Message-ID: <26918.1576132686@famine>
In-Reply-To: <CAF2d9jgjeky0eMgwFZKHO_RLTBNstH1gCq4hn1FfO=TtrMP1ow@mail.gmail.com>

Mahesh Bandewar (महेश बंडेवार) wrote:

>On Sat, Dec 7, 2019 at 2:09 PM Jay Vosburgh <jay.vosburgh@canonical.com> wrote:
>>
>> Mahesh Bandewar <maheshb@google.com> wrote:
>>
>> >After the recent fix 1899bb325149 ("bonding: fix state transition
>> >issue in link monitoring"), active-backup mode with miimon
>> >initially comes up fine, but after a link failure both members
>> >transition into backup state.
>> >
>> >Steps to reproduce the scenario (eth1 and eth2 are the
>> >slaves of the bond):
>> >
>> >    ip link set eth1 up
>> >    ip link set eth2 down
>> >    sleep 1
>> >    ip link set eth2 up
>> >    ip link set eth1 down
>> >    cat /sys/class/net/eth1/bonding_slave/state
>> >    cat /sys/class/net/eth2/bonding_slave/state
>> >
>> >Fixes: 1899bb325149 ("bonding: fix state transition issue in link monitoring")
>> >CC: Jay Vosburgh <jay.vosburgh@canonical.com>
>> >Signed-off-by: Mahesh Bandewar <maheshb@google.com>
>> >---
>> > drivers/net/bonding/bond_main.c | 3 ---
>> > 1 file changed, 3 deletions(-)
>> >
>> >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> >index fcb7c2f7f001..ad9906c102b4 100644
>> >--- a/drivers/net/bonding/bond_main.c
>> >+++ b/drivers/net/bonding/bond_main.c
>> >@@ -2272,9 +2272,6 @@ static void bond_miimon_commit(struct bonding *bond)
>> >                       } else if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) {
>> >                               /* make it immediately active */
>> >                               bond_set_active_slave(slave);
>> >-                      } else if (slave != primary) {
>> >-                              /* prevent it from being the active one */
>> >-                              bond_set_backup_slave(slave);
>>
>>         How does this fix things?  Doesn't bond_select_active_slave() ->
>> bond_change_active_slave() set the backup flag correctly via a call to
>> bond_set_slave_active_flags() when it sets a slave to be the active
>> slave?  If this change resolves the problem, I'm not sure how this ever
>> worked correctly, even prior to 1899bb325149.
>>
>Hi Jay, I used kprobes to figure out the brokenness this patch fixes.
>Prior to your patch this call would not happen, but with the patch,
>this extra call puts the master into backup mode erroneously
>(in fact both members end up in backup state). The mechanics you
>mentioned work correctly, except that in the prior code the
>switch statement was using new_link, which was not the same as
>link_new_state. miimon_inspect would update new_link, which is what
>the miimon_commit code used. link_new_state was used only to
>mitigate the rtnl-lock issue, which would update the "link". Hence in
>the prior code, this path would never get executed.

	I'm looking at the old code (prior to 1899bb325149), and I don't
see a path to what you're describing for the down to up transition in
active-backup mode.

bond_miimon_inspect enters the switch with slave->link == BOND_LINK_DOWN.

link_state is nonzero, so it calls bond_propose_link_state(BOND_LINK_BACK),
which sets slave->link_new_state to BOND_LINK_BACK.

It falls through to the BOND_LINK_BACK case and sets
slave->new_link = BOND_LINK_UP.

bond_mii_monitor then calls bond_commit_link_state, which sets
slave->link to BOND_LINK_BACK.

bond_miimon_commit enters switch (new_link), which is BOND_LINK_UP.

In "case BOND_LINK_UP:" there is no way out of this block, and it should
proceed to call bond_set_backup_slave for active-backup mode every time.
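
As a rough illustration of the sequence above, here is a small stand-alone
C model of the old down-to-up path. It is not kernel code: the struct, enum
and function names (model_slave, model_inspect, and so on) are made up for
this sketch, updelay is assumed to be 0, and only the three fields discussed
here are tracked. The backup branch mirrors the "slave != primary" test in
the hunk quoted above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the BOND_LINK_* and BOND_MODE_* values. */
enum model_link { LINK_UP, LINK_DOWN, LINK_BACK, LINK_NOCHANGE };
enum model_mode { MODE_ACTIVEBACKUP, MODE_8023AD, MODE_OTHER };

/* Only the state this thread talks about; not struct slave. */
struct model_slave {
	enum model_link link;           /* committed link state */
	enum model_link link_new_state; /* proposed by inspect */
	enum model_link new_link;       /* what commit switches on (old code) */
	bool is_primary;
	bool backup;                    /* true once "set backup" has run */
};

/* Models the inspect step for the down-to-up case, assuming updelay == 0. */
static void model_inspect(struct model_slave *s, bool carrier_up)
{
	s->new_link = LINK_NOCHANGE;
	s->link_new_state = s->link;

	switch (s->link) {
	case LINK_DOWN:
		if (!carrier_up)
			break;
		/* propose BACK, as bond_propose_link_state() would */
		s->link_new_state = LINK_BACK;
		/* fall through */
	case LINK_BACK:
		if (carrier_up)
			s->new_link = LINK_UP;
		break;
	default:
		break;
	}
}

/* Models bond_commit_link_state(): the proposed state becomes the link. */
static void model_commit_link_state(struct model_slave *s)
{
	s->link = s->link_new_state;
}

/* Models the "case BOND_LINK_UP:" block of the commit step. */
static void model_miimon_commit(struct model_slave *s, enum model_mode mode)
{
	if (s->new_link != LINK_UP)
		return;

	s->link = LINK_UP;
	if (mode == MODE_8023AD)
		s->backup = true;
	else if (mode != MODE_ACTIVEBACKUP)
		s->backup = false;      /* made immediately active */
	else if (!s->is_primary)
		s->backup = true;       /* the branch the patch removes */
}

/* Runs the three steps in the order described above. */
static bool model_down_to_up(struct model_slave *s, enum model_mode mode)
{
	model_inspect(s, true);
	model_commit_link_state(s);
	model_miimon_commit(s, mode);
	return s->backup;
}
```

Calling model_down_to_up() on a non-primary slave whose link starts at
LINK_DOWN in MODE_ACTIVEBACKUP leaves it with link == LINK_UP and the backup
flag set, which is the behaviour I would expect from the old code as well.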

>The steps to reproduce this issue are straightforward and it happens 100%
>of the time (I used two mlx interfaces but that shouldn't matter).

	Yes, I've been able to reproduce it locally (with igb, FWIW).  I
think the patch is likely ok, I'm just mystified as to how the backup
setting could have worked prior to 1899bb325149, so perhaps the Fixes
tag doesn't go back far enough.

	-J

>thanks,
>--mahesh..
>>         -J
>>
>> >                       }
>> >
>> >                       slave_info(bond->dev, slave->dev, "link status definitely up, %u Mbps %s duplex\n",
>> >--
>> >2.24.0.393.g34dc348eaf-goog

---
	-Jay Vosburgh, jay.vosburgh@canonical.com
