* [BUG] port(belong to a lacp aggregate group)cannot be unselected while the connected port failed
From: zhou zhengwu @ 2017-02-09  9:19 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek; +Cc: linux-kernel

Hi

[1] Test OS: SUSE 11 SP3 (kernel version 3.0.93)

[2] Problem description:
    ServerA [bond] ----------- [bond] ServerB
    Two servers are connected through bonding interfaces with LACP enabled.
    At first, everything works well.
    Then one port on ServerB (call it port B) fails: it can still send
packets but cannot receive any, including LACPDUs, while its link status
stays UP.
    Result:
    On ServerB, port B is unselected and traffic is sent through the other
port(s) of the aggregate.
    On ServerA, however, port A (the port connected to port B on ServerB)
is still selected by LACP, and traffic is still sent through it.

[3] Two questions about the implementation of the LACP RX machine:
   1. Is the condition marked with "*" below reasonable?

static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
{
      …………………….
              case AD_RX_CURRENT:
                      // detect loopback situation
                      if (!MAC_ADDRESS_COMPARE(&(lacpdu->actor_system), &(port->actor_system))) {
                              // INFO_RECEIVED_LOOPBACK_FRAMES
                              pr_err("%s: An illegal loopback occurred on adapter (%s).\n"
                                     "Check the configuration to verify that all adapters are connected to 802.3ad compliant switch ports\n",
                                     port->slave->dev->master->name, port->slave->dev->name);
                              return;
                      }
                      __update_selected(lacpdu, port);
                      __update_ntt(lacpdu, port);
                      __record_pdu(lacpdu, port);
                      port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(port->actor_oper_port_state & AD_STATE_LACP_TIMEOUT));
                      port->actor_oper_port_state &= ~AD_STATE_EXPIRED;
                      // verify that if the aggregator is enabled, the port is enabled too.
                      // (because if the link goes down for a short time, the 802.3ad will not
                      // catch it, and the port will continue to be disabled)
    *****             if (port->aggregator
    *****                 && port->aggregator->is_active
    *****                 && !__port_is_enabled(port))
    *****                     __enable_port(port); -------------------- is the judging condition reasonable?
                      break;
      …………………….
}
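
To make question 1 concrete, the marked condition can be reduced to a
small stand-alone model. This is only a sketch: the struct layouts and the
names model_aggregator, model_port and rx_current_reenable() are
hypothetical stand-ins, not the bonding driver's definitions. It only
illustrates that the condition looks at the local aggregator and at
whether the port is already enabled, and never at anything the partner
reports.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the bonding driver's structures. */
struct model_aggregator {
	bool is_active;
};

struct model_port {
	struct model_aggregator *aggregator;
	bool enabled;          /* models the result of __port_is_enabled() */
	bool partner_in_sync;  /* models the partner's Synchronization bit;
	                        * deliberately not read by the check below */
};

/*
 * Models the check marked with "*" in the AD_RX_CURRENT case: the port is
 * re-enabled whenever it belongs to an active aggregator and is currently
 * disabled. Returns true if this call re-enabled the port.
 */
bool rx_current_reenable(struct model_port *port)
{
	if (port->aggregator && port->aggregator->is_active && !port->enabled) {
		port->enabled = true;  /* models __enable_port() */
		return true;
	}
	return false;
}
```

In this model a port with partner_in_sync == false is re-enabled just like
one with partner_in_sync == true, as long as its aggregator is active.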

    2. When the LACP RX machine receives a LACPDU, it should decide
whether the port remains selected. Currently, if the RX machine is in
AD_RX_CURRENT and the partner's configuration is unchanged, the port
stays selected. That is not reasonable in some situations, for example
when the partner port has failed so that it can send but not receive
packets while its link is still UP.

static void __update_selected(struct lacpdu *lacpdu, struct port *port)
{
	if (lacpdu && port) {
		const struct port_params *partner = &port->partner_oper;

		// check if any parameter is different
		if (ntohs(lacpdu->actor_port) != partner->port_number ||
		    ntohs(lacpdu->actor_port_priority) != partner->port_priority ||
		    MAC_ADDRESS_COMPARE(&lacpdu->actor_system, &partner->system) ||
		    ntohs(lacpdu->actor_system_priority) != partner->system_priority ||
		    ntohs(lacpdu->actor_key) != partner->key ||
		    (lacpdu->actor_state & AD_STATE_AGGREGATION) != (partner->port_state & AD_STATE_AGGREGATION)) {
			// update the state machine Selected variable
			port->sm_vars &= ~AD_PORT_SELECTED;
		}
	}
}
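
The comparison above can be modelled outside the kernel to show the
behaviour in question 2. The following is a minimal sketch under stated
assumptions: struct model_params, model_params_changed() and the MODEL_*
macros are hypothetical names, all fields are kept in host byte order (so
the ntohs() calls are omitted), and the bit values are taken from the LACP
actor state encoding. It mirrors the fields compared in
__update_selected() and nothing else.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* State bits, modelled on the LACP actor state encoding. */
#define MODEL_STATE_AGGREGATION     0x04
#define MODEL_STATE_SYNCHRONIZATION 0x08

/* Hypothetical flattened view of the compared parameters. */
struct model_params {
	uint8_t  system[6];
	uint16_t system_priority;
	uint16_t key;
	uint16_t port_number;
	uint16_t port_priority;
	uint8_t  port_state;
};

/*
 * Returns true when the actor information in a received LACPDU differs
 * from the recorded partner information in one of the fields that
 * __update_selected() compares, i.e. when Selected would be cleared.
 * Only the Aggregation bit of the state byte takes part.
 */
bool model_params_changed(const struct model_params *pdu_actor,
			  const struct model_params *partner)
{
	return pdu_actor->port_number != partner->port_number ||
	       pdu_actor->port_priority != partner->port_priority ||
	       memcmp(pdu_actor->system, partner->system,
		      sizeof(partner->system)) != 0 ||
	       pdu_actor->system_priority != partner->system_priority ||
	       pdu_actor->key != partner->key ||
	       (pdu_actor->port_state & MODEL_STATE_AGGREGATION) !=
	       (partner->port_state & MODEL_STATE_AGGREGATION);
}
```

With this model, a LACPDU that differs from the recorded partner data only
in the Synchronization bit is reported as unchanged, so Selected stays
set; changing the key, for instance, is reported as a change.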


* Re: [BUG] port(belong to a lacp aggregate group)cannot be unselected while the connected port failed
From: zhou zhengwu @ 2017-02-10  6:48 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek; +Cc: linux-kernel

ad_rx_machine() calls __update_selected(), which checks whether any LACP
parameter has changed. If one has, the port is unselected.

However, this function does not check for changes in the partner's port
state, for example whether the partner port still reports
synchronization, and that makes the LACP state machine fail in this case.
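
As an illustration of the point about port state, the two helpers below
compare only the state byte. This is a sketch, not a proposed patch: the
function names and MODEL_* macros are hypothetical, and the bit values are
taken from the LACP actor state encoding. The first helper uses the mask
that __update_selected() applies today; the second shows a hypothetical
wider mask that also covers the Synchronization bit. Whether clearing
Selected is the right reaction to a lost Synchronization bit is a separate
question that this sketch does not settle.

```c
#include <stdbool.h>
#include <stdint.h>

/* State bits, modelled on the LACP actor state encoding. */
#define MODEL_STATE_AGGREGATION     0x04
#define MODEL_STATE_SYNCHRONIZATION 0x08

/* State comparison as done in __update_selected(): Aggregation bit only. */
bool state_differs_current(uint8_t pdu_actor_state, uint8_t recorded_state)
{
	const uint8_t mask = MODEL_STATE_AGGREGATION;

	return (pdu_actor_state & mask) != (recorded_state & mask);
}

/* Hypothetical variant that also notices a change in Synchronization. */
bool state_differs_with_sync(uint8_t pdu_actor_state, uint8_t recorded_state)
{
	const uint8_t mask = MODEL_STATE_AGGREGATION |
			     MODEL_STATE_SYNCHRONIZATION;

	return (pdu_actor_state & mask) != (recorded_state & mask);
}
```

For a recorded partner state with both bits set and a new LACPDU with only
the Aggregation bit set, the first helper reports no difference while the
second one does.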


