From: "Wu, Hao" <hao.wu@intel.com>
To: Moritz Fischer <mdf@kernel.org>, Tom Rix <trix@redhat.com>
Cc: "Weight, Russell H" <russell.h.weight@intel.com>,
	"linux-fpga@vger.kernel.org" <linux-fpga@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"lgoncalv@redhat.com" <lgoncalv@redhat.com>,
	"Xu, Yilun" <yilun.xu@intel.com>,
	"Gerlach, Matthew" <matthew.gerlach@intel.com>
Subject: RE: [PATCH v2 1/1] fpga: dfl: afu: harden port enable logic
Date: Fri, 18 Sep 2020 02:00:42 +0000
Message-ID: <DM6PR11MB3819EE8D1193AA51A50EE14E853F0@DM6PR11MB3819.namprd11.prod.outlook.com>
In-Reply-To: <20200917213850.GA30570@archbook>

> Subject: Re: [PATCH v2 1/1] fpga: dfl: afu: harden port enable logic
> 
> On Thu, Sep 17, 2020 at 01:28:22PM -0700, Tom Rix wrote:
> >
> > On 9/17/20 11:32 AM, Russ Weight wrote:
> > > Port enable is not complete until ACK = 0. Change
> > > __afu_port_enable() to guarantee that the enable process
> > > is complete by polling for ACK == 0.
> > >
> > > Signed-off-by: Russ Weight <russell.h.weight@intel.com>
> General note: Please keep a changelog if you send updated versions of a
> patch. This can be added here with an extra '---' + text between the
> Signed-off-by and the diffstat:
> 
> ---
> Changes from v1:
> - Foo
> - Bar
> > > ---
> > >  drivers/fpga/dfl-afu-error.c |  2 +-
> > >  drivers/fpga/dfl-afu-main.c  | 29 +++++++++++++++++++++--------
> > >  drivers/fpga/dfl-afu.h       |  2 +-
> > >  3 files changed, 23 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/fpga/dfl-afu-error.c b/drivers/fpga/dfl-afu-error.c
> > > index c4691187cca9..0806532a3e9f 100644
> > > --- a/drivers/fpga/dfl-afu-error.c
> > > +++ b/drivers/fpga/dfl-afu-error.c
> > > @@ -103,7 +103,7 @@ static int afu_port_err_clear(struct device *dev, u64 err)
> > >  	__afu_port_err_mask(dev, false);
> > >
> >
> > There is an earlier bit that sets ret = -EINVAL.
> >
> > This error will be lost or not handled well.
> >
> > Right now it doesn't seem to be handled.
> 
> Ultimately you'd want to report *at least* one of them. The current code
> seems to continue and enable the port in either case. Is that what it
> should be doing?

To clear errors, we first have to put the port into reset, and then bring
the port out of reset once error clearing is done. If error clearing fails,
we still want to get the port working again at the very least. As we know,
if the port stays in reset, the accelerator connected to it won't work.

> 
> Is the timeout more severe than the invalid value? Do you want to print
> a warning?

Yes. It's a very bad case if the port cannot be enabled any more (the
accelerator may no longer be accessible). The hardware should already be in
an error state at that point, so it's better to have some warning messages
here.

> 
> Either way a comment explaining why this is ok would be appreciated :)
> >
> > >  	/* Enable the Port by clear the reset */
> > > -	__afu_port_enable(pdev);
> > > +	ret = __afu_port_enable(pdev);
> > >
> > >  done:
> > >  	mutex_unlock(&pdata->lock);
> > > diff --git a/drivers/fpga/dfl-afu-main.c b/drivers/fpga/dfl-afu-main.c
> > > index 753cda4b2568..f73b06cdf13c 100644
> > > --- a/drivers/fpga/dfl-afu-main.c
> > > +++ b/drivers/fpga/dfl-afu-main.c
> > > @@ -21,6 +21,9 @@
> > >
> > >  #include "dfl-afu.h"
> > >
> > > +#define RST_POLL_INVL 10 /* us */
> > > +#define RST_POLL_TIMEOUT 1000 /* us */
> > > +
> > >  /**
> > >   * __afu_port_enable - enable a port by clear reset
> > >   * @pdev: port platform device.
> > > @@ -32,7 +35,7 @@
> > >   *
> > >   * The caller needs to hold lock for protection.
> > >   */
> > > -void __afu_port_enable(struct platform_device *pdev)
> > > +int __afu_port_enable(struct platform_device *pdev)
> > >  {
> > >  	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
> > >  	void __iomem *base;
> > > @@ -41,7 +44,7 @@ void __afu_port_enable(struct platform_device *pdev)
> > >  	WARN_ON(!pdata->disable_count);
> > >
> > >  	if (--pdata->disable_count != 0)
> > > -		return;
> > > +		return 0;
> > Is this really a success ? Maybe -EBUSY ?
> Seems like if it's severe enough for a warning you'd probably want to
> return an error.

As Yilun mentioned, this is just a reference count operation, so we don't
need to return an error code.
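
To illustrate the reference counting, here is a minimal sketch, assuming
the disable path increments the same counter that the enable path
decrements. The counted_* names are hypothetical and only model the
counting, not the register accesses or the ACK polling.

```c
#include <stdbool.h>

/* Hypothetical model of the disable count; not the real platform data. */
struct counted_port {
	int disable_count;
	bool in_reset;
};

static void counted_port_disable(struct counted_port *p)
{
	/* Only the first disabler actually puts the port into reset. */
	if (p->disable_count++ == 0)
		p->in_reset = true;
}

static int counted_port_enable(struct counted_port *p)
{
	/*
	 * Other disablers still hold the port in reset: nothing to do,
	 * and nothing went wrong, so this is reported as success.
	 */
	if (--p->disable_count != 0)
		return 0;

	p->in_reset = false;
	return 0;
}
```

After two disables, the first enable only drops the count and the port
stays in reset; the second enable is the one that releases it. Both return
0 because neither is a failure.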

> > >
> > >  	base = dfl_get_feature_ioaddr_by_id(&pdev->dev, PORT_FEATURE_ID_HEADER);
> > >
> > > @@ -49,10 +52,20 @@ void __afu_port_enable(struct platform_device *pdev)
> > >  	v = readq(base + PORT_HDR_CTRL);
> > >  	v &= ~PORT_CTRL_SFTRST;
> > >  	writeq(v, base + PORT_HDR_CTRL);
> > > -}
> > >
> > > -#define RST_POLL_INVL 10 /* us */
> > > -#define RST_POLL_TIMEOUT 1000 /* us */
> > > +	/*
> > > +	 * HW clears the ack bit to indicate that the port is fully out
> > > +	 * of reset.
> > > +	 */
> > > +	if (readq_poll_timeout(base + PORT_HDR_CTRL, v,
> > > +			       !(v & PORT_CTRL_SFTRST_ACK),
> > > +			       RST_POLL_INVL, RST_POLL_TIMEOUT)) {
> > > +		dev_err(&pdev->dev, "timeout, failure to enable device\n");
> > > +		return -ETIMEDOUT;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > >
> > >  /**
> > >   * __afu_port_disable - disable a port by hold reset
> > > @@ -111,7 +124,7 @@ static int __port_reset(struct platform_device *pdev)
> > >
> > >  	ret = __afu_port_disable(pdev);
> > >  	if (!ret)
> > > -		__afu_port_enable(pdev);
> > > +		ret = __afu_port_enable(pdev);
> > >
> > >  	return ret;
> > >  }
> > > @@ -872,11 +885,11 @@ static int afu_dev_destroy(struct platform_device *pdev)
> > >  static int port_enable_set(struct platform_device *pdev, bool enable)
> > >  {
> > >  	struct dfl_feature_platform_data *pdata = dev_get_platdata(&pdev->dev);
> > > -	int ret = 0;
> > > +	int ret;
> > >
> > >  	mutex_lock(&pdata->lock);
> > >  	if (enable)
> > > -		__afu_port_enable(pdev);
> > > +		ret = __afu_port_enable(pdev);
> > >  	else
> > >  		ret = __afu_port_disable(pdev);
> > >  	mutex_unlock(&pdata->lock);
> > > diff --git a/drivers/fpga/dfl-afu.h b/drivers/fpga/dfl-afu.h
> > > index 576e94960086..e5020e2b1f3d 100644
> > > --- a/drivers/fpga/dfl-afu.h
> > > +++ b/drivers/fpga/dfl-afu.h
> > > @@ -80,7 +80,7 @@ struct dfl_afu {
> > >  };
> > >
> > >  /* hold pdata->lock when call __afu_port_enable/disable */
> > > -void __afu_port_enable(struct platform_device *pdev);
> > > +int __afu_port_enable(struct platform_device *pdev);
> > >  int __afu_port_disable(struct platform_device *pdev);
> >
> > The other functions in this file are named afu_*. Since
> > __afu_port_enable/disable are used in other places, would it make sense
> > to remove the '__' prefix?
> 
> The idea on those is to indicate that the caller needs to be cautious
> (often a lock / mutex is required). I think keeping them as is is fine.

Yes. That's why we add the prefix for these functions.

Thanks
Hao

> 
> >
> > If you think so, maybe a cleanup patch later.
> >
> > Tom
> >
> > >
> > >  void afu_mmio_region_init(struct dfl_feature_platform_data *pdata);
> >
> 
> Thanks,
> Moritz
