All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Toppins <jtoppins@redhat.com>
To: Jay Vosburgh <jay.vosburgh@canonical.com>
Cc: netdev@vger.kernel.org, liuhangbin@gmail.com,
	Veaceslav Falico <vfalico@gmail.com>,
	Andy Gospodarek <andy@greyhouse.net>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH net v3 2/2] bonding: 802.3ad: fix no transmission of LACPDUs
Date: Tue, 16 Aug 2022 09:41:25 -0400	[thread overview]
Message-ID: <3d55690a-8932-4560-4267-ab28816fdb47@redhat.com> (raw)
In-Reply-To: <17000.1660655501@famine>

On 8/16/22 09:11, Jay Vosburgh wrote:
> Jonathan Toppins <jtoppins@redhat.com> wrote:
> 
>> This is caused by the global variable ad_ticks_per_sec being zero as
>> demonstrated by the reproducer script discussed below. This causes
>> all timer values in __ad_timer_to_ticks to be zero, resulting
>> in the periodic timer to never fire.
>>
>> To reproduce:
>> Run the script in
>> `tools/testing/selftests/drivers/net/bonding/bond-break-lacpdu-tx.sh` which
>> puts bonding into a state where it never transmits LACPDUs.
>>
>> line 44: ip link add fbond type bond mode 4 miimon 200 \
>>             xmit_hash_policy 1 ad_actor_sys_prio 65535 lacp_rate fast
>> setting bond param: ad_actor_sys_prio
>> given:
>>     params.ad_actor_system = 0
>> call stack:
>>     bond_option_ad_actor_sys_prio()
>>     -> bond_3ad_update_ad_actor_settings()
>>        -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
>>        -> ad.system.sys_mac_addr = bond->dev->dev_addr; because
>>             params.ad_actor_system == 0
>> results:
>>      ad.system.sys_mac_addr = bond->dev->dev_addr
>>
>> line 48: ip link set fbond address 52:54:00:3B:7C:A6
>> setting bond MAC addr
>> call stack:
>>     bond->dev->dev_addr = new_mac
>>
>> line 52: ip link set fbond type bond ad_actor_sys_prio 65535
>> setting bond param: ad_actor_sys_prio
>> given:
>>     params.ad_actor_system = 0
>> call stack:
>>     bond_option_ad_actor_sys_prio()
>>     -> bond_3ad_update_ad_actor_settings()
>>        -> set ad.system.sys_priority = bond->params.ad_actor_sys_prio
>>        -> ad.system.sys_mac_addr = bond->dev->dev_addr; because
>>             params.ad_actor_system == 0
>> results:
>>      ad.system.sys_mac_addr = bond->dev->dev_addr
>>
>> line 60: ip link set veth1-bond down master fbond
>> given:
>>     params.ad_actor_system = 0
>>     params.mode = BOND_MODE_8023AD
>>     ad.system.sys_mac_addr == bond->dev->dev_addr
>> call stack:
>>     bond_enslave
>>     -> bond_3ad_initialize(); because first slave
>>        -> if ad.system.sys_mac_addr != bond->dev->dev_addr
>>           return
>> results:
>>      Nothing is run in bond_3ad_initialize() because dev_add equals
>>      sys_mac_addr leaving the global ad_ticks_per_sec zero as it is
>>      never initialized anywhere else.
>>
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
>> Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
>> ---
>>
>> Notes:
>>     v2:
>>      * split this fix from the reproducer
>>     v3:
>>      * rebased to latest net/master
>>
>> drivers/net/bonding/bond_3ad.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index d7fb33c078e8..957d30db6f95 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -84,7 +84,8 @@ enum ad_link_speed_type {
>> static const u8 null_mac_addr[ETH_ALEN + 2] __long_aligned = {
>> 	0, 0, 0, 0, 0, 0
>> };
>> -static u16 ad_ticks_per_sec;
>> +
>> +static u16 ad_ticks_per_sec = 1000 / AD_TIMER_INTERVAL;
>> static const int ad_delta_in_ticks = (AD_TIMER_INTERVAL * HZ) / 1000;
> 
> 	I still feel like this is kind of a hack, as it's not really
> fixing bond_3ad_initialize to actually work (which is the real problem
> as I understand it).  If it's ok to skip all that for this case, then
> why do we ever need to call bond_3ad_initialize?
> 

The way it is currently written you still need to call 
bond_3ad_initialize() just not for setting the tick resolution. The 
issue here is ad_ticks_per_sec is used in several places to calculate 
timer periods, __ad_timer_to_ticks(), for various timers in the 802.3ad 
protocol. And if this variable, ad_ticks_per_sec, is left uninitialized 
all of these timer periods go to zero. Since the value passed in 
bond_3ad_initialize() is an immediate value I simply moved it off of the 
call stack and set the static global variable instead.

To fix bond_3ad_initialize(), probably something like the below is 
needed, but I do not understand why the guard if check was placed in 
bond_3ad_initialize().

diff --git i/drivers/net/bonding/bond_3ad.c w/drivers/net/bonding/bond_3ad.c
index d7fb33c078e8..5b5146f5c4ea 100644
--- i/drivers/net/bonding/bond_3ad.c
+++ w/drivers/net/bonding/bond_3ad.c
@@ -2005,32 +2005,21 @@ void bond_3ad_initiate_agg_selection(struct 
bonding *bond, int timeout)
   *
   * Can be called only after the mac address of the bond is set.
   */
-void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution)
+void bond_3ad_initialize(struct bonding *bond)
  {
-	/* check that the bond is not initialized yet */
-	if (!MAC_ADDRESS_EQUAL(&(BOND_AD_INFO(bond).system.sys_mac_addr),
-				bond->dev->dev_addr)) {
-
-		BOND_AD_INFO(bond).aggregator_identifier = 0;
-
-		BOND_AD_INFO(bond).system.sys_priority =
-			bond->params.ad_actor_sys_prio;
-		if (is_zero_ether_addr(bond->params.ad_actor_system))
-			BOND_AD_INFO(bond).system.sys_mac_addr =
-			    *((struct mac_addr *)bond->dev->dev_addr);
-		else
-			BOND_AD_INFO(bond).system.sys_mac_addr =
-			    *((struct mac_addr *)bond->params.ad_actor_system);
-
-		/* initialize how many times this module is called in one
-		 * second (should be about every 100ms)
-		 */
-		ad_ticks_per_sec = tick_resolution;
+	BOND_AD_INFO(bond).aggregator_identifier = 0;
+	BOND_AD_INFO(bond).system.sys_priority =
+		bond->params.ad_actor_sys_prio;
+	if (is_zero_ether_addr(bond->params.ad_actor_system))
+		BOND_AD_INFO(bond).system.sys_mac_addr =
+		    *((struct mac_addr *)bond->dev->dev_addr);
+	else
+		BOND_AD_INFO(bond).system.sys_mac_addr =
+		    *((struct mac_addr *)bond->params.ad_actor_system);

-		bond_3ad_initiate_agg_selection(bond,
-						AD_AGGREGATOR_SELECTION_TIMER *
-						ad_ticks_per_sec);
-	}
+	bond_3ad_initiate_agg_selection(bond,
+					AD_AGGREGATOR_SELECTION_TIMER *
+					ad_ticks_per_sec);
  }

  /**
diff --git i/drivers/net/bonding/bond_main.c 
w/drivers/net/bonding/bond_main.c
index 50e60843020c..5f56af9dc3ba 100644
--- i/drivers/net/bonding/bond_main.c
+++ w/drivers/net/bonding/bond_main.c
@@ -2078,10 +2078,10 @@ int bond_enslave(struct net_device *bond_dev, 
struct net_device *slave_dev,
  		/* if this is the first slave */
  		if (!prev_slave) {
  			SLAVE_AD_INFO(new_slave)->id = 1;
-			/* Initialize AD with the number of times that the AD timer is 
called in 1 second
-			 * can be called only after the mac address of the bond is set
+			/* can be called only after the mac address of the
+			 * bond is set
  			 */
-			bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL);
+			bond_3ad_initialize(bond);
  		} else {
  			SLAVE_AD_INFO(new_slave)->id =
  				SLAVE_AD_INFO(prev_slave)->id + 1;
diff --git i/include/net/bond_3ad.h w/include/net/bond_3ad.h
index 184105d68294..be2992e6de5d 100644
--- i/include/net/bond_3ad.h
+++ w/include/net/bond_3ad.h
@@ -290,7 +290,7 @@ static inline const char 
*bond_3ad_churn_desc(churn_state_t state)
  }

  /* ========== AD Exported functions to the main bonding code ========== */
-void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution);
+void bond_3ad_initialize(struct bonding *bond);
  void bond_3ad_bind_slave(struct slave *slave);
  void bond_3ad_unbind_slave(struct slave *slave);
  void bond_3ad_state_machine_handler(struct work_struct *);

-Jon


  reply	other threads:[~2022-08-16 13:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-15 15:08 [PATCH net v3 0/2] bonding: 802.3ad: fix no transmission of LACPDUs Jonathan Toppins
2022-08-15 15:08 ` [PATCH net v3 1/2] selftests: include bonding tests into the kselftest infra Jonathan Toppins
2022-08-16  7:28   ` Hangbin Liu
2022-08-15 15:08 ` [PATCH net v3 2/2] bonding: 802.3ad: fix no transmission of LACPDUs Jonathan Toppins
2022-08-16  7:29   ` Hangbin Liu
2022-08-16 13:11   ` Jay Vosburgh
2022-08-16 13:41     ` Jonathan Toppins [this message]
2022-08-16 17:47       ` Jonathan Toppins
2022-08-17 23:24         ` Jay Vosburgh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3d55690a-8932-4560-4267-ab28816fdb47@redhat.com \
    --to=jtoppins@redhat.com \
    --cc=andy@greyhouse.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=jay.vosburgh@canonical.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=liuhangbin@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=vfalico@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.