* [BUG] LACP state machine will find nothing and do nothing when link is single passing
@ 2017-02-16  9:20 zhou zhengwu
  2017-02-17  8:46 ` zhou zhengwu
  0 siblings, 1 reply; 2+ messages in thread
From: zhou zhengwu @ 2017-02-16  9:20 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek; +Cc: netdev

Hi

[1] Problem description:
    ServerA [bond] ----------- [bond] ServerB
    I have two servers connected through bonding interfaces with LACP
enabled. Both bonding interfaces are configured as follows:
    insmod ./bonding.ko mode=4 lacp_rate=1 all_slaves_active=1
    echo +trunk1 > /sys/class/net/bonding_masters
    echo +eth0 > /sys/class/net/trunk1/bonding/slaves
    echo +eth1 > /sys/class/net/trunk1/bonding/slaves

    At first, everything works well.
    Then one port (for example, eth0) on serverB fails: it can still
send packets but no longer receives them, yet the link stays UP.
    Result:
    On serverB, LACP works well.
    On serverA, however, the eth0 that connects to eth0 on serverB stays
in the same LACP aggregator as eth1, and traffic is still sent through it.

[2] From the LACP implementation, we can see:
    In the condition above, eth0 on serverA receives LACPDUs in which
AD_STATE_SYNCHRONIZATION is unset in actor_state, indicating that
something is wrong with eth0 on serverB. However, our LACP
implementation checks only the port parameters, not the port state.
If the peer's port state changes, the local LACP state machine does not
notice it, so the state machine behaves incorrectly.
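
As a rough illustration of the state octet discussed above, the sketch
below tests the Synchronization bit of a received actor_state value. The
bit values follow the LACPDU state field layout and mirror the AD_STATE_*
names used in bond_3ad.c; peer_in_sync() is a hypothetical helper written
for this example and is not a kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of the LACPDU actor/partner state octet. The names mirror the
 * AD_STATE_* macros in bond_3ad.c. */
#define AD_STATE_LACP_ACTIVITY   0x01
#define AD_STATE_LACP_TIMEOUT    0x02
#define AD_STATE_AGGREGATION     0x04
#define AD_STATE_SYNCHRONIZATION 0x08
#define AD_STATE_COLLECTING      0x10
#define AD_STATE_DISTRIBUTING    0x20
#define AD_STATE_DEFAULTED       0x40
#define AD_STATE_EXPIRED         0x80

/* Hypothetical helper (not part of the kernel): returns true when the
 * peer reports itself as in sync in the actor_state it sent us. */
static bool peer_in_sync(uint8_t actor_state)
{
	return (actor_state & AD_STATE_SYNCHRONIZATION) != 0;
}
```

In the failure described here, the LACPDUs arriving on serverA's eth0
would carry an actor_state for which such a check returns false, while
the other fields stay the same.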


* Re: [BUG] LACP state machine will find nothing and do nothing when link is single passing
  2017-02-16  9:20 [BUG] LACP state machine will find nothing and do nothing when link is single passing zhou zhengwu
@ 2017-02-17  8:46 ` zhou zhengwu
  0 siblings, 0 replies; 2+ messages in thread
From: zhou zhengwu @ 2017-02-17  8:46 UTC (permalink / raw)
  To: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek
  Cc: netdev, qianguoxin, chenhuan.chenhuan

Obviously, on a link that passes traffic in only one direction, one end
of the link still receives LACPDUs, and the LACP configuration parameters
in them (e.g. port number, port priority, system priority and so on) do
not change. So the LACP state machine does not notice the link change,
and the failed link is not unselected from the active aggregator.

I added an additional check of the port state where the LACPDU is
processed in the rx state machine, in
ad_rx_machine()->__update_selected(). With that, everything works well.

The detailed modification follows:

diff -c5 bond_3ad.c bond_3ad.c

static void __update_selected(struct lacpdu *lacpdu, struct port *port)
{
		    ........
   		    ntohs(lacpdu->actor_port_priority) != partner->port_priority ||
   		    MAC_ADDRESS_COMPARE(&lacpdu->actor_system, &partner->system) ||
   		    ntohs(lacpdu->actor_system_priority) != partner->system_priority ||
   		    ntohs(lacpdu->actor_key) != partner->key ||
   		    (lacpdu->actor_state & AD_STATE_AGGREGATION) != (partner->port_state & AD_STATE_AGGREGATION)
+ 		    || (lacpdu->actor_state & AD_STATE_SYNCHRONIZATION) != (partner->port_state & AD_STATE_SYNCHRONIZATION)) {
   			// update the state machine Selected variable
   			port->sm_vars &= ~AD_PORT_SELECTED;
   		}
   	}
   }

***************
static void ad_port_selection_logic(struct port *port)
{
             ..........
   		     (aggregator->partner_system_priority == port->partner_oper.system_priority) &&
   		     (aggregator->partner_oper_aggregator_key == port->partner_oper.key)
   		    ) &&
   		    ((MAC_ADDRESS_COMPARE(&(port->partner_oper.system), &(null_mac_addr)) && // partner answers
   		      !aggregator->is_individual)  // but is not individual OR
+ 		    ) && (port->partner_oper.port_state & AD_STATE_SYNCHRONIZATION)
   		   ) {
   			// attach to the founded aggregator
   			port->aggregator = aggregator;
   			port->actor_port_aggregator_identifier =
   				port->aggregator->aggregator_identifier;
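
To see in isolation why the extra comparison matters, here is a minimal
userspace sketch of the partner comparison. It is a simplified model, not
the kernel code: struct peer_info and partner_changed() are hypothetical
names, all fields are assumed to be in host byte order (so no ntohs()),
and the check_sync flag only exists to compare the behaviour with and
without the proposed check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define AD_STATE_AGGREGATION     0x04
#define AD_STATE_SYNCHRONIZATION 0x08

/* Hypothetical, simplified stand-in for the partner information carried
 * in a LACPDU and for the copy recorded on the local port. */
struct peer_info {
	uint8_t  system[6];
	uint16_t system_priority;
	uint16_t key;
	uint16_t port_number;
	uint16_t port_priority;
	uint8_t  port_state;
};

/* Returns true when the received information differs from the recorded
 * one, i.e. when the port would be marked unselected. With check_sync
 * set, the Synchronization bit takes part in the comparison as well. */
static bool partner_changed(const struct peer_info *pdu,
			    const struct peer_info *rec,
			    bool check_sync)
{
	uint8_t mask = AD_STATE_AGGREGATION;

	if (check_sync)
		mask |= AD_STATE_SYNCHRONIZATION;

	return pdu->port_number != rec->port_number ||
	       pdu->port_priority != rec->port_priority ||
	       memcmp(pdu->system, rec->system, sizeof(pdu->system)) != 0 ||
	       pdu->system_priority != rec->system_priority ||
	       pdu->key != rec->key ||
	       (pdu->port_state & mask) != (rec->port_state & mask);
}
```

If the received information differs from the recorded one only in the
Synchronization bit, partner_changed() reports no change with check_sync
false and a change with check_sync true, which is the situation the
one-way link produces.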
		

