linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: <Justin.Lee1@Dell.com>
To: <sam@mendozajonas.com>, <netdev@vger.kernel.org>
Cc: <davem@davemloft.net>, <linux-kernel@vger.kernel.org>,
	<openbmc@lists.ozlabs.org>
Subject: RE: [PATCH net-next v3 6/6] net/ncsi: Configure multi-package, multi-channel modes with failover
Date: Fri, 9 Nov 2018 18:07:49 +0000	[thread overview]
Message-ID: <a5595570bc9e41af8deb676d847014bb@AUSX13MPS302.AMER.DELL.COM> (raw)
In-Reply-To: <ce6a678b1a0673050bc8070cb99dd68d1929c227.camel@mendozajonas.com>

Hi Samuel,

The extra patch fixed most issues but I see another corner case.

Thanks,
Justin


> On Thu, 2018-11-08 at 22:48 +0000, Justin.Lee1@Dell.com wrote:
> > Hi Samuel,
> > 
> > For multi-package and multi-channel case, channel seems to be select correctly. Expect that,
> > I still see the timing issue for back-to-back netlink command. Due to that, channel might be
> > set to invisible state. Please refer to ncsi0 and ncsi2 below. The channel state is set to 3.
> > 
> > cat /sys/kernel/debug/ncsi_protocol/ncsi_device_status
> > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
> > =====================================================================
> >   2   eth2   ncsi0  000 000 1  1  1  1  1  0  0  3  0  1  1  1  0  1
> >   2   eth2   ncsi1  000 001 0  0  1  1  1  0  0  1  0  1  1  1  0  1
> >   2   eth2   ncsi2  001 000 1  0  1  1  1  1  0  3  0  1  1  1  0  1
> >   2   eth2   ncsi3  001 001 1  0  1  1  1  1  0  2  1  1  1  1  0  1
> > =====================================================================
> > MP: Multi-mode Package  WP: Whitelist Package
> > MC: Multi-mode Channel  WC: Whitelist Channel
> > PC: Primary Channel     CS: Channel State IA/A/IV 1/2/3
> > PS: Poll Status         LS: Link Status
> > RU: Running             CR: Carrier OK
> > NQ: Queue Stopped       HA: Hardware Arbitration
> > 
> > The timing issue is not only happening in application. If I use using the following way
> > to send the request, I can see the issue as well. 
> > 
> > ncsi_netlink -l 2 -a 0x01 -m; ncsi_netlink -l 2 -p 0 -b 0x03 -m; ncsi_netlink -l 2 -p 1 -b 0x00 -m;
> > ncsi_netlink -l 2 -a 0x03 -m; ncsi_netlink -l 2 -p 0 -b 0x00 -m; ncsi_netlink -l 2 -p 1 -b 0x03 -m;
> 
> This actually recreates for me as well; I see now what you mean about
> channels getting stuck in the invisible state. I believe I've narrowed
> down the issue. I've pasted an additional patch below if you are able to
> test on your machine.
> 
> > 
> > 
> > Also, there is one issue below for non-multi-package/non-multi-channel case.
> > 
> > Thanks,
> > Justin
> > 
> > 
> > > @@ -1008,32 +1164,49 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > >  
> > >  			ncm = &nc->modes[NCSI_MODE_LINK];
> > >  			if (ncm->data[2] & 0x1) {
> > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > >  				found = nc;
> > > -				goto out;
> > > +				with_link = true;
> > >  			}
> > >  
> > > -			spin_unlock_irqrestore(&nc->lock, flags);
> > > +			/* If multi_channel is enabled configure all valid
> > > +			 * channels whether or not they currently have link
> > > +			 * so they will have AENs enabled.
> > > +			 */
> > > +			if (with_link || np->multi_channel) {
> > > +				spin_lock_irqsave(&ndp->lock, flags);
> > > +				list_add_tail_rcu(&nc->link,
> > > +						  &ndp->channel_queue);
> > > +				spin_unlock_irqrestore(&ndp->lock, flags);
> > > +
> > > +				netdev_dbg(ndp->ndev.dev,
> > > +					   "NCSI: Channel %u added to queue (link %s)\n",
> > > +					   nc->id,
> > > +					   ncm->data[2] & 0x1 ? "up" : "down");
> > > +			}
> > > +
> > > +			spin_unlock_irqrestore(&nc->lock, cflags);
> > > +
> > > +			if (with_link && !np->multi_channel)
> > > +				break;
> > 
> > The line needs to change to "goto found". If not, all channels with link will be added
> > even if the multi-channel is not enabled for that package. The ncsi1 below is enabled.
> > There is no netlink command sent to enable multi-package or multi-channel.
> > 
> > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
> > =====================================================================
> >   2   eth2   ncsi0  000 000 1  1  0  0  1  1  0  2  1  1  1  1  0  1
> >   2   eth2   ncsi1  000 001 1  0  0  0  1  1  0  2  1  1  1  1  0  1
> >   2   eth2   ncsi2  001 000 0  0  0  0  1  1  0  1  0  1  1  1  0  1
> >   2   eth2   ncsi3  001 001 0  0  0  0  1  1  0  1  0  1  1  1  0  1
> > =====================================================================
> > MP: Multi-mode Package  WP: Whitelist Package
> > MC: Multi-mode Channel  WC: Whitelist Channel
> > PC: Primary Channel     CS: Channel State IA/A/IV 1/2/3
> > PS: Poll Status         LS: Link Status
> > RU: Running             CR: Carrier OK
> > NQ: Queue Stopped       HA: Hardware Arbitration
> > 
> > >  		}
> > > +		if (with_link && !ndp->multi_package)
> > > +			break;
> > >  	}
> > 
> > found:
> 
> This *may* be part of the above issue, I don't see this in normal
> operation. The combination of (with_link && !np->multi_channel) and
> (with_link && !ndp->multi_package) should prevent additional channels
> being added without the need for 'goto found'. Please let me know if you
> still see it with the extra patch.
> 

This one is fixed by your extra patch but I see another corner case. Basically,
the issue is due to the link status is not updating during the process of
choosing the active channel. During the process, maybe we should find a 
chance to send Get Link Status command to all whitelist channels.

Here is my step.
1. All four channels plug Ethernet cable.
2. BMC starts without any multi-package/multi-channel configuration.
3. The ncsi0 channel is selected by NC-SI driver.
4. Remove cable from ncsi0.
5. The ncsi1 channel is selected by NC-SI driver.
6. Remove cable from ncsi1.
7. The ncsi2 channel is selected by NC-SI driver.
8. Insert cable for ncsi0. Link status will not be updated.
9. Remove cable from ncsi2.
10. The ncsi3 channel is selected by NC-SI driver.
11. Remove cable from ncsi3.
12. The selected channel is ncsi3 without link.

cat /sys/kernel/debug/ncsi_protocol/ncsi_device_
IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
=====================================================================
  2   eth2   ncsi0  000 000 0  0  0  0  1  1  0  1  0  0  1  1  0  1
  2   eth2   ncsi1  000 001 0  0  0  0  1  1  0  1  0  0  1  1  0  1
  2   eth2   ncsi2  001 000 0  0  0  0  1  1  0  1  0  0  1  1  0  1
  2   eth2   ncsi3  001 001 1  1  0  0  1  1  0  2  1  0  1  1  0  1
=====================================================================
MP: Multi-mode Package  WP: Whitelist Package
MC: Multi-mode Channel  WC: Whitelist Channel
PC: Primary Channel     CS: Channel State IA/A/IV 1/2/3
PS: Poll Status         LS: Link Status
RU: Running             CR: Carrier OK
NQ: Queue Stopped       HA: Hardware Arbitration

> > 
> > After applying this change, I notice that if there is no link available to BMC when BMC
> > starts, NC-SI can't properly configure channel once I plug in the Ethernet cable. 
> > 
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_aen_handler_lsc() - pkg 0 ch 0 state up
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_aen_handler_lsc() - had_link 0, has_link 1, chained 0
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_stop_channel_monitor() - pkg 0 ch 0
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_process_next_channel()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_process_next_channel() - pkg 0 ch 0 INVISIBLE
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_process_next_channel() - suspending pkg 0 ch 0
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0400 select
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0403 dc
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0404 deselect
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0405 done
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_rsp_handler_dp() - pkg 0 ch 0 INACTIVE
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_rsp_handler_dp() - pkg 0 ch 1 INACTIVE
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_dev_work()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 state 0406 deselect
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_suspend_channel() - pkg 0 ch 0 INACTIVE
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_process_next_channel()
> > npcm7xx-emc f0825000.eth eth2: NCSI: ncsi_process_next_channel() - No more channels to process
> > npcm7xx-emc f0825000.eth eth2: NCSI interface down
> 
> Good find, there was a corner case in the LSC AEN handler changes that
> led to this, I've fixed this in the patch as well. Thanks for testing!

This one is fixed.

> 
> 
> From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
> From: Samuel Mendoza-Jonas <sam@mendozajonas.com>
> Date: Fri, 9 Nov 2018 13:11:03 +1100
> Subject: [PATCH] net/ncsi: Reset state fixes, single-channel LSC
> 
> ---
>  net/ncsi/ncsi-aen.c    |  8 +++++---
>  net/ncsi/ncsi-manage.c | 19 +++++++++++++++----
>  2 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/net/ncsi/ncsi-aen.c b/net/ncsi/ncsi-aen.c
> index 39c2e9eea2ba..034cb1dc5566 100644
> --- a/net/ncsi/ncsi-aen.c
> +++ b/net/ncsi/ncsi-aen.c
> @@ -93,14 +93,16 @@ static int ncsi_aen_handler_lsc(struct ncsi_dev_priv *ndp,
>  	if ((had_link == has_link) || chained)
>  		return 0;
>  
> -	if (!ndp->multi_package && !nc->package->multi_channel) {
> -		if (had_link)
> -			ndp->flags |= NCSI_DEV_RESHUFFLE;
> +	if (!ndp->multi_package && !nc->package->multi_channel && had_link) {
> +		ndp->flags |= NCSI_DEV_RESHUFFLE;
>  		ncsi_stop_channel_monitor(nc);
>  		spin_lock_irqsave(&ndp->lock, flags);
>  		list_add_tail_rcu(&nc->link, &ndp->channel_queue);
>  		spin_unlock_irqrestore(&ndp->lock, flags);
>  		return ncsi_process_next_channel(ndp);
> +	} else {
> +		/* Configured channel came up */
> +		return 0;
>  	}
>  
>  	if (had_link) {
> diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
> index fa3c2144f5ba..92e59f07f9a7 100644
> --- a/net/ncsi/ncsi-manage.c
> +++ b/net/ncsi/ncsi-manage.c
> @@ -1063,17 +1063,17 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
>  	case ncsi_dev_state_config_done:
>  		netdev_dbg(ndp->ndev.dev, "NCSI: channel %u config done\n",
>  			   nc->id);
> +		spin_lock_irqsave(&nc->lock, flags);
> +		nc->state = NCSI_CHANNEL_ACTIVE;
> +
>  		if (ndp->flags & NCSI_DEV_RESET) {
>  			/* A reset event happened during config, start it now */
> -			spin_lock_irqsave(&nc->lock, flags);
>  			nc->reconfigure_needed = false;
>  			spin_unlock_irqrestore(&nc->lock, flags);
> -			nd->state = ncsi_dev_state_functional;
>  			ncsi_reset_dev(nd);
>  			break;
>  		}
>  
> -		spin_lock_irqsave(&nc->lock, flags);
>  		if (nc->reconfigure_needed) {
>  			/* This channel's configuration has been updated
>  			 * part-way during the config state - start the
> @@ -1092,7 +1092,6 @@ static void ncsi_configure_channel(struct ncsi_dev_priv *ndp)
>  			break;
>  		}
>  
> -		nc->state = NCSI_CHANNEL_ACTIVE;
>  		if (nc->modes[NCSI_MODE_LINK].data[2] & 0x1) {
>  			hot_nc = nc;
>  		} else {
> @@ -1803,6 +1802,18 @@ int ncsi_reset_dev(struct ncsi_dev *nd)
>  			spin_unlock_irqrestore(&ndp->lock, flags);
>  			return 0;
>  		}
> +	} else {
> +		switch (nd->state) {
> +		case ncsi_dev_state_suspend_done:
> +		case ncsi_dev_state_config_done:
> +		case ncsi_dev_state_functional:
> +			/* Ok */
> +			break;
> +		default:
> +			/* Current reset operation happening */
> +			spin_unlock_irqrestore(&ndp->lock, flags);
> +			return 0;
> +		}
>  	}
>  
>  	if (!list_empty(&ndp->channel_queue)) {
> -- 
> 2.19.1




  reply	other threads:[~2018-11-09 18:07 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-08  2:49 [PATCH net-next v3 0/6] net/ncsi: Allow enabling multiple packages & channels Samuel Mendoza-Jonas
2018-11-08  2:49 ` [PATCH net-next v3 1/6] net/ncsi: Don't enable all channels when HWA available Samuel Mendoza-Jonas
2018-11-08  2:49 ` [PATCH net-next v3 2/6] net/ncsi: Probe single packages to avoid conflict Samuel Mendoza-Jonas
2018-11-08  2:49 ` [PATCH net-next v3 3/6] net/ncsi: Don't deselect package in suspend if active Samuel Mendoza-Jonas
2018-11-08  2:49 ` [PATCH net-next v3 4/6] net/ncsi: Don't mark configured channels inactive Samuel Mendoza-Jonas
2018-11-08  2:49 ` [PATCH net-next v3 5/6] net/ncsi: Reset channel state in ncsi_start_dev() Samuel Mendoza-Jonas
2018-11-08  2:49 ` [PATCH net-next v3 6/6] net/ncsi: Configure multi-package, multi-channel modes with failover Samuel Mendoza-Jonas
2018-11-08 22:48   ` Justin.Lee1
2018-11-09  2:17     ` Samuel Mendoza-Jonas
2018-11-09 18:07       ` Justin.Lee1 [this message]
2018-11-09 21:58       ` Justin.Lee1
2018-11-12  0:38         ` Samuel Mendoza-Jonas
2018-11-13 18:00           ` Justin.Lee1

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a5595570bc9e41af8deb676d847014bb@AUSX13MPS302.AMER.DELL.COM \
    --to=justin.lee1@dell.com \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=openbmc@lists.ozlabs.org \
    --cc=sam@mendozajonas.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).