All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dziedziuch, SylwesterX" <sylwesterx.dziedziuch@intel.com>
To: PJ Waskiewicz <pwaskiewicz@jumptrading.com>,
	"Nguyen, Anthony L" <anthony.l.nguyen@intel.com>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
	"pjwaskiewicz@gmail.com" <pjwaskiewicz@gmail.com>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Loktionov, Aleksandr" <aleksandr.loktionov@intel.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	"intel-wired-lan@lists.osuosl.org"
	<intel-wired-lan@lists.osuosl.org>,
	"Machnikowski, Maciej" <maciej.machnikowski@intel.com>
Subject: RE: [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe()
Date: Mon, 20 Sep 2021 07:48:20 +0000	[thread overview]
Message-ID: <DM6PR11MB3371A285032A583A73BA62E0E6A09@DM6PR11MB3371.namprd11.prod.outlook.com> (raw)
In-Reply-To: <MW4PR14MB4796F279908B7E4D11622C66A1DE9@MW4PR14MB4796.namprd14.prod.outlook.com>

Hello PJ,

> -----Original Message-----
> From: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
> Sent: Saturday, September 18, 2021 4:02 AM
> To: Dziedziuch, SylwesterX <sylwesterx.dziedziuch@intel.com>; Nguyen,
> Anthony L <anthony.l.nguyen@intel.com>
> Cc: davem@davemloft.net; pjwaskiewicz@gmail.com; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; netdev@vger.kernel.org; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; intel-wired-lan@lists.osuosl.org;
> Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: RE: [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe()
> 
> Hi Sylwester,
> 
> >
> > You are right the problem is with misc IRQ vector but as far as I can
> > see this patch only moves i40e_reset_interrupt_capability() outside of
> > i40e_clear_interrupt_scheme(). It does not fix the problem of
> > i40e_free_misc_vector() on unallocated vector in error path. We have a
> > proper fix for this that adds additional check for
> > __I40E_MISC_IRQ_REQUESTED bit to i40e_free_misc_vector():
> 
> It does fix the problem if you call the function when the MISC vector hasn't
> been allocated.  Yes, I moved reset_interrupt_capability() out so it could be
> separately called in the probe() error cleanup path.
> 
> >         if (pf->flags & I40E_FLAG_MSIX_ENABLED && pf->msix_entries &&
> >             test_bit(__I40E_MISC_IRQ_REQUESTED, pf->state)) {
> >
> > This bit is set only if misc vector was properly allocated. The patch
> > will be on intel-wired soon.
> 
> This isn't even in the OOT driver from SourceForge.  And even if you used that
> to guard freeing the vector or not, the first bit of that function is still writing to
> a register to disable that cause in the hardware:
> 
> static void i40e_free_misc_vector(struct i40e_pf *pf) {
>         /* Disable ICR 0 */
>         wr32(&pf->hw, I40E_PFINT_ICR0_ENA, 0);
>         i40e_flush(&pf->hw);
> 
> Would you still want to do that blindly if the vector wasn't allocated in the first
> place?  Seems excessive, but it'd be harmless.  Seems like not calling this
> function altogether would be cleaner and generate less MMIO activity if the
> MISC vector wasn't allocated at all and we're falling out of an error path...
> 
> I am really at a loss here.  This is clearly broken.  We have an Oops.  We get
> these occasionally on boot, and it's really annoying to deal with on production
> machines.  What is the definition of "soon" here for this new patch to show
> up?  My distro vendor would love to pull some sort of fix in so we can get it
> into our build images, and stop having this problem.  My patch fixes the
> immediate problem.  If you don't like the patch (which it appears you don't;
> that's fine), then stalling or saying a different fix is coming "soon" is really not a
> great support model.  This would be great to merge, and then if you want to
> make it "better" on your schedule, it's open source, and you can submit a
> patch.  Or I'll be happy to respin the patch, but still calling free_misc_vector()
> in an error path when the MISC vector was never allocated seems like a bad
> design decision to me.
> 
> -PJ

I totally agree that we shouldn’t call free_misc_vector if misc vector wasn't allocated yet but to me this is not what your patch is doing. free_misc_vector() is called in clear_interrupt_scheme not reset_interrupt_capability(). In your patch clear_interrupt_scheme() will still be called when pf switch setup fails in i40e_probe() and it will still call free_misc_vector on unallocated misc IRQ. Other proper way to fix this would be moving free_misc_vector() out of clear_interrupt_scheme() and calling it separately when misc vector was really allocated. As for the hw register being written in our patch as you said it's harmless. The patch we've prepared should be on iwl today.

BR
Sylwester Dziedziuch

> 
> ________________________________
> 
> Note: This email is for the confidential use of the named addressee(s) only and
> may contain proprietary, confidential, or privileged information and/or
> personal data. If you are not the intended recipient, you are hereby notified
> that any review, dissemination, or copying of this email is strictly prohibited,
> and requested to notify the sender immediately and destroy this email and
> any attachments. Email transmission cannot be guaranteed to be secure or
> error-free. The Company, therefore, does not make any guarantees as to the
> completeness or accuracy of this email or any attachments. This email is for
> informational purposes only and does not constitute a recommendation, offer,
> request, or solicitation of any kind to buy, sell, subscribe, redeem, or perform
> any type of transaction of a financial product. Personal data, as defined by
> applicable data protection and privacy laws, contained in this email may be
> processed by the Company, and any of its affiliated or related companies, for
> legal, compliance, and/or business-related purposes. You may have rights
> regarding your personal data; for information on exercising these rights or the
> Company’s treatment of personal data, please email
> datarequests@jumptrading.com.

WARNING: multiple messages have this Message-ID (diff)
From: Dziedziuch, SylwesterX <sylwesterx.dziedziuch@intel.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe()
Date: Mon, 20 Sep 2021 07:48:20 +0000	[thread overview]
Message-ID: <DM6PR11MB3371A285032A583A73BA62E0E6A09@DM6PR11MB3371.namprd11.prod.outlook.com> (raw)
In-Reply-To: <MW4PR14MB4796F279908B7E4D11622C66A1DE9@MW4PR14MB4796.namprd14.prod.outlook.com>

Hello PJ,

> -----Original Message-----
> From: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
> Sent: Saturday, September 18, 2021 4:02 AM
> To: Dziedziuch, SylwesterX <sylwesterx.dziedziuch@intel.com>; Nguyen,
> Anthony L <anthony.l.nguyen@intel.com>
> Cc: davem at davemloft.net; pjwaskiewicz at gmail.com; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; Loktionov, Aleksandr
> <aleksandr.loktionov@intel.com>; netdev at vger.kernel.org; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; intel-wired-lan at lists.osuosl.org;
> Machnikowski, Maciej <maciej.machnikowski@intel.com>
> Subject: RE: [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe()
> 
> Hi Sylwester,
> 
> >
> > You are right the problem is with misc IRQ vector but as far as I can
> > see this patch only moves i40e_reset_interrupt_capability() outside of
> > i40e_clear_interrupt_scheme(). It does not fix the problem of
> > i40e_free_misc_vector() on unallocated vector in error path. We have a
> > proper fix for this that adds additional check for
> > __I40E_MISC_IRQ_REQUESTED bit to i40e_free_misc_vector():
> 
> It does fix the problem if you call the function when the MISC vector hasn't
> been allocated.  Yes, I moved reset_interrupt_capability() out so it could be
> separately called in the probe() error cleanup path.
> 
> >         if (pf->flags & I40E_FLAG_MSIX_ENABLED && pf->msix_entries &&
> >             test_bit(__I40E_MISC_IRQ_REQUESTED, pf->state)) {
> >
> > This bit is set only if misc vector was properly allocated. The patch
> > will be on intel-wired soon.
> 
> This isn't even in the OOT driver from SourceForge.  And even if you used that
> to guard freeing the vector or not, the first bit of that function is still writing to
> a register to disable that cause in the hardware:
> 
> static void i40e_free_misc_vector(struct i40e_pf *pf) {
>         /* Disable ICR 0 */
>         wr32(&pf->hw, I40E_PFINT_ICR0_ENA, 0);
>         i40e_flush(&pf->hw);
> 
> Would you still want to do that blindly if the vector wasn't allocated in the first
> place?  Seems excessive, but it'd be harmless.  Seems like not calling this
> function altogether would be cleaner and generate less MMIO activity if the
> MISC vector wasn't allocated at all and we're falling out of an error path...
> 
> I am really at a loss here.  This is clearly broken.  We have an Oops.  We get
> these occasionally on boot, and it's really annoying to deal with on production
> machines.  What is the definition of "soon" here for this new patch to show
> up?  My distro vendor would love to pull some sort of fix in so we can get it
> into our build images, and stop having this problem.  My patch fixes the
> immediate problem.  If you don't like the patch (which it appears you don't;
> that's fine), then stalling or saying a different fix is coming "soon" is really not a
> great support model.  This would be great to merge, and then if you want to
> make it "better" on your schedule, it's open source, and you can submit a
> patch.  Or I'll be happy to respin the patch, but still calling free_misc_vector()
> in an error path when the MISC vector was never allocated seems like a bad
> design decision to me.
> 
> -PJ

I totally agree that we shouldn?t call free_misc_vector if misc vector wasn't allocated yet but to me this is not what your patch is doing. free_misc_vector() is called in clear_interrupt_scheme not reset_interrupt_capability(). In your patch clear_interrupt_scheme() will still be called when pf switch setup fails in i40e_probe() and it will still call free_misc_vector on unallocated misc IRQ. Other proper way to fix this would be moving free_misc_vector() out of clear_interrupt_scheme() and calling it separately when misc vector was really allocated. As for the hw register being written in our patch as you said it's harmless. The patch we've prepared should be on iwl today.

BR
Sylwester Dziedziuch

> 
> ________________________________
> 
> Note: This email is for the confidential use of the named addressee(s) only and
> may contain proprietary, confidential, or privileged information and/or
> personal data. If you are not the intended recipient, you are hereby notified
> that any review, dissemination, or copying of this email is strictly prohibited,
> and requested to notify the sender immediately and destroy this email and
> any attachments. Email transmission cannot be guaranteed to be secure or
> error-free. The Company, therefore, does not make any guarantees as to the
> completeness or accuracy of this email or any attachments. This email is for
> informational purposes only and does not constitute a recommendation, offer,
> request, or solicitation of any kind to buy, sell, subscribe, redeem, or perform
> any type of transaction of a financial product. Personal data, as defined by
> applicable data protection and privacy laws, contained in this email may be
> processed by the Company, and any of its affiliated or related companies, for
> legal, compliance, and/or business-related purposes. You may have rights
> regarding your personal data; for information on exercising these rights or the
> Company?s treatment of personal data, please email
> datarequests at jumptrading.com.

  reply	other threads:[~2021-09-20  7:51 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-26 22:19 [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe() PJ Waskiewicz
2021-08-26 22:19 ` [Intel-wired-lan] " PJ Waskiewicz
2021-08-30 20:52 ` Nguyen, Anthony L
2021-08-30 20:52   ` [Intel-wired-lan] " Nguyen, Anthony L
2021-08-31 20:58   ` PJ Waskiewicz
2021-08-31 20:58     ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-13 19:37     ` PJ Waskiewicz
2021-09-13 19:37       ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-13 20:29       ` Nguyen, Anthony L
2021-09-13 20:29         ` [Intel-wired-lan] " Nguyen, Anthony L
2021-09-14  8:23         ` Dziedziuch, SylwesterX
2021-09-14  8:23           ` [Intel-wired-lan] " Dziedziuch, SylwesterX
2021-09-14 21:40           ` PJ Waskiewicz
2021-09-14 21:40             ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-15  9:53             ` Dziedziuch, SylwesterX
2021-09-15  9:53               ` [Intel-wired-lan] " Dziedziuch, SylwesterX
2021-09-18  2:01               ` PJ Waskiewicz
2021-09-18  2:01                 ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-20  7:48                 ` Dziedziuch, SylwesterX [this message]
2021-09-20  7:48                   ` Dziedziuch, SylwesterX
2021-09-21 17:06                   ` PJ Waskiewicz
2021-09-21 17:06                     ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-23 15:17                     ` PJ Waskiewicz
2021-09-23 15:17                       ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-24  7:04                       ` Dziedziuch, SylwesterX
2021-09-24  7:04                         ` [Intel-wired-lan] " Dziedziuch, SylwesterX
  -- strict thread matches above, loose matches on Subject: below --
2021-08-25 19:23 PJ Waskiewicz
2021-08-26  8:03 ` Maciej Fijalkowski
2021-08-26 14:26   ` PJ Waskiewicz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DM6PR11MB3371A285032A583A73BA62E0E6A09@DM6PR11MB3371.namprd11.prod.outlook.com \
    --to=sylwesterx.dziedziuch@intel.com \
    --cc=aleksandr.loktionov@intel.com \
    --cc=anthony.l.nguyen@intel.com \
    --cc=davem@davemloft.net \
    --cc=intel-wired-lan@lists.osuosl.org \
    --cc=jesse.brandeburg@intel.com \
    --cc=maciej.fijalkowski@intel.com \
    --cc=maciej.machnikowski@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pjwaskiewicz@gmail.com \
    --cc=pwaskiewicz@jumptrading.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.