From: "Estrin, Alex" <alex.estrin-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	"Dalessandro, Dennis" <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"jgg-uk2M96/98Pc@public.gmane.org" <jgg-uk2M96/98Pc@public.gmane.org>
Cc: "linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org" <linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	"Marciniszyn, Mike" <mike.marciniszyn-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	"Weiny, Ira" <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: RE: [PATCH for-next 16/16] IB/ipoib: Fix for potential no-carrier state
Date: Fri, 26 Jan 2018 17:18:58 +0000
Message-ID: <F3529576D8E232409F431C309E29399336CFBE19@CRSMSX102.amr.corp.intel.com>
In-Reply-To: <1516986011.27592.207.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

> -----Original Message-----
> From: Doug Ledford [mailto:dledford@redhat.com]
> Sent: Friday, January 26, 2018 12:00 PM
> To: Dalessandro, Dennis <dennis.dalessandro@intel.com>; jgg@ziepe.ca
> Cc: linux-rdma@vger.kernel.org; Marciniszyn, Mike <mike.marciniszyn@intel.com>;
> Weiny, Ira <ira.weiny@intel.com>; Estrin, Alex <alex.estrin@intel.com>
> Subject: Re: [PATCH for-next 16/16] IB/ipoib: Fix for potential no-carrier state
> 
> On Fri, 2018-01-26 at 06:33 -0800, Dennis Dalessandro wrote:
> > From: Alex Estrin <alex.estrin@intel.com>
> >
> > On reboot SM can program port pkey table before ipoib registered its
> > event handler, which could result in missing pkey event and leave root
> > interface with initial pkey value from index 0.
> >
> > Since OPA port starts with invalid pkey in index 0, root interface will
> > fail to initialize and stay down with no-carrier flag.
> >
> > For IB ipoib interface may end up with pkey different from value
> > opensm put in pkey table idx 0, resulting in connectivity issues
> > (different mcast groups, for example).
> >
> > Close the window by calling event handler after registration
> > to make sure ipoib pkey is in sync with port pkey table.
> >
> > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
> > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Alex Estrin <alex.estrin@intel.com>
> > Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
> > ---
> >  drivers/infiniband/ulp/ipoib/ipoib_main.c |    3 +++
> >  1 files changed, 3 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > index 5930c7d..161ba8c 100644
> > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> > @@ -2306,6 +2306,9 @@ void ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca)
> >  			      priv->ca, ipoib_event);
> >  	ib_register_event_handler(&priv->event_handler);
> >
> > +	/* call event handler to ensure pkey in sync */
> > +	queue_work(ipoib_workqueue, &priv->flush_heavy);
> > +
> 
> This seems like a bit of a sledgehammer to the issue.  Looking through
> ipoib_add_port(), the real race is that we have to call ib_query_pkey()
> early in the init sequence as some of the later steps need it to be set
> (ipoib_dev_init() must have it already set for one), but since we don't
> setup our event handler until after we've finished setting up the
> device, there is that window from our first ib_query_pkey call until we
> complete the ib_register_event_handler() call for the pkey to change.
> Instead of throwing the flush regardless, it might be nicer to do:
> 
> 	{
> 		u16 new_pkey;
> 
> 		ib_query_pkey(hca, port, 0, &new_pkey);
> 		if (priv->pkey != (new_pkey | 0x8000))
> 			/* The pkey changed between when we
> 			 * read it and now, flush the device
> 			 */
> 			queue_work(ipoib_workqueue, &priv->flush_heavy);
> 	}

Hi Doug,
The reason I did not go this way is that, at this early point of initialization,
the pkey handler will not act as a "sledgehammer" and flush all cached records;
it will query the pkey, update it if it changed, and then exit early.
In any case, we will query the pkey only once, not twice as could happen with your approach.
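
To illustrate the behaviour described above, here is a rough userspace
sketch. It is only a model under stated assumptions: the struct and function
names (port_model, ipoib_model, model_query_pkey, model_flush_heavy) are
hypothetical and are not the driver's code. The 0x8000 membership bit is
taken from the quoted snippet.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u /* membership bit, as in the quoted snippet */

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct port_model {
	uint16_t pkey_index0;      /* pkey table entry 0, programmed by the SM */
};

struct ipoib_model {
	uint16_t pkey;             /* pkey cached by the interface */
	bool initialized;          /* false while the interface is being set up */
	unsigned int queries;      /* number of reads of the pkey table */
	unsigned int full_flushes; /* number of times cached records were dropped */
};

/* Read index 0 of the modelled pkey table and count the read. */
static uint16_t model_query_pkey(struct ipoib_model *priv,
				 const struct port_model *port)
{
	priv->queries++;
	return port->pkey_index0 | PKEY_FULL_MEMBER;
}

/*
 * Modelled handler: one query, update the cached pkey if it changed,
 * and exit early while the interface is not initialized yet.
 * Returns true when the cached pkey was updated.
 */
static bool model_flush_heavy(struct ipoib_model *priv,
			      const struct port_model *port)
{
	uint16_t new_pkey = model_query_pkey(priv, port);
	bool changed = (priv->pkey != new_pkey);

	if (changed)
		priv->pkey = new_pkey;

	if (!priv->initialized)
		return changed;        /* early exit: nothing cached to flush */

	if (changed)
		priv->full_flushes++;  /* stand-in for dropping cached records */

	return changed;
}
```

In this model, a pkey change that arrives before the handler is registered
is picked up by a single model_flush_heavy() call with one table read and no
full flush, which is the point made above about querying only once.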

Thanks,
Alex.
 
> 
> >  	result = register_netdev(priv->dev);
> >  	if (result) {
> >  		pr_warn("%s: couldn't register ipoib port %d; error %d\n",
> >
> 
> --
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
