From: PJ Waskiewicz <pwaskiewicz@jumptrading.com>
To: "Dziedziuch, SylwesterX" <sylwesterx.dziedziuch@intel.com>,
	"Nguyen, Anthony L" <anthony.l.nguyen@intel.com>
Cc: "davem@davemloft.net" <davem@davemloft.net>,
	"pjwaskiewicz@gmail.com" <pjwaskiewicz@gmail.com>,
	"Fijalkowski, Maciej" <maciej.fijalkowski@intel.com>,
	"Loktionov, Aleksandr" <aleksandr.loktionov@intel.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	"intel-wired-lan@lists.osuosl.org"
	<intel-wired-lan@lists.osuosl.org>,
	"Machnikowski, Maciej" <maciej.machnikowski@intel.com>
Subject: RE: [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe()
Date: Sat, 18 Sep 2021 02:01:43 +0000
Message-ID: <MW4PR14MB4796F279908B7E4D11622C66A1DE9@MW4PR14MB4796.namprd14.prod.outlook.com>
In-Reply-To: <DM6PR11MB3371B4431AD7C46672C7E439E6DB9@DM6PR11MB3371.namprd11.prod.outlook.com>

Hi Sylwester,

>
> You are right the problem is with misc IRQ vector but as far as I can see this
> patch only moves i40e_reset_interrupt_capability() outside of
> i40e_clear_interrupt_scheme(). It does not fix the problem of
> i40e_free_misc_vector() on unallocated vector in error path. We have a
> proper fix for this that adds additional check for
> __I40E_MISC_IRQ_REQUESTED bit to i40e_free_misc_vector():

It does fix the problem that occurs when the function is called while the MISC vector hasn't been allocated.  Yes, I moved reset_interrupt_capability() out so it could be called separately in the probe() error cleanup path.

>         if (pf->flags & I40E_FLAG_MSIX_ENABLED && pf->msix_entries &&
>             test_bit(__I40E_MISC_IRQ_REQUESTED, pf->state)) {
>
> This bit is set only if misc vector was properly allocated. The patch will be on
> intel-wired soon.

This check isn't even in the out-of-tree (OOT) driver from SourceForge.  And even if you used that bit to guard freeing the vector, the first thing the function does is still write to a register to disable that cause in the hardware:

static void i40e_free_misc_vector(struct i40e_pf *pf)
{
        /* Disable ICR 0 */
        wr32(&pf->hw, I40E_PFINT_ICR0_ENA, 0);
        i40e_flush(&pf->hw);

Would you still want to do that blindly if the vector wasn't allocated in the first place?  That seems excessive, though it'd be harmless.  Not calling this function at all would be cleaner, and would generate less MMIO activity, when the MISC vector was never allocated and we're falling out of an error path...
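
To make the two options concrete, here is a minimal user-space sketch, not driver code.  Everything prefixed with mock_ is hypothetical and invented for illustration: struct mock_pf stands in for struct i40e_pf, its misc_irq_requested field plays the role of the __I40E_MISC_IRQ_REQUESTED bit, and the counters stand in for register writes and IRQ frees.  It only assumes what is quoted above, namely that the free function disables ICR0 before doing anything else.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; NOT the real struct i40e_pf. */
struct mock_pf {
	bool misc_irq_requested; /* plays the role of __I40E_MISC_IRQ_REQUESTED */
	int mmio_writes;         /* simulated register writes */
	int irq_frees;           /* simulated IRQ frees */
};

/* Simulated register write; stands in for the wr32() + i40e_flush() pair. */
static void mock_disable_icr0(struct mock_pf *pf)
{
	pf->mmio_writes++;
}

/* Simulated IRQ free; freeing a vector that was never requested is the bug. */
static void mock_free_irq(struct mock_pf *pf)
{
	assert(pf->misc_irq_requested);
	pf->misc_irq_requested = false;
	pf->irq_frees++;
}

/* Option A: guard inside the free function; the register write still happens. */
static void mock_free_misc_vector_guarded(struct mock_pf *pf)
{
	mock_disable_icr0(pf);
	if (pf->misc_irq_requested)
		mock_free_irq(pf);
}

/* Option B: the error path skips the whole call when nothing was requested. */
static void mock_error_path_skip(struct mock_pf *pf)
{
	if (!pf->misc_irq_requested)
		return;
	mock_disable_icr0(pf);
	mock_free_irq(pf);
}
```

With a zero-initialized struct mock_pf (vector never requested), option A performs one simulated register write and no free, while option B performs neither.  Both avoid the bad free; they differ only in the extra MMIO write.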

I am really at a loss here.  This is clearly broken.  We have an Oops.  We get these occasionally on boot, and it's really annoying to deal with on production machines.

What is the definition of "soon" here for this new patch to show up?  My distro vendor would love to pull some sort of fix in so we can get it into our build images, and stop having this problem.  My patch fixes the immediate problem.  If you don't like the patch (which it appears you don't; that's fine), then stalling or saying a different fix is coming "soon" is really not a great support model.

This would be great to merge, and then if you want to make it "better" on your schedule, it's open source, and you can submit a patch.  Or I'll be happy to respin the patch, but still calling free_misc_vector() in an error path when the MISC vector was never allocated seems like a bad design decision to me.

-PJ



Thread overview: 29+ messages
2021-08-26 22:19 [PATCH 1/1] i40e: Avoid double IRQ free on error path in probe() PJ Waskiewicz
2021-08-26 22:19 ` [Intel-wired-lan] " PJ Waskiewicz
2021-08-30 20:52 ` Nguyen, Anthony L
2021-08-30 20:52   ` [Intel-wired-lan] " Nguyen, Anthony L
2021-08-31 20:58   ` PJ Waskiewicz
2021-08-31 20:58     ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-13 19:37     ` PJ Waskiewicz
2021-09-13 19:37       ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-13 20:29       ` Nguyen, Anthony L
2021-09-13 20:29         ` [Intel-wired-lan] " Nguyen, Anthony L
2021-09-14  8:23         ` Dziedziuch, SylwesterX
2021-09-14  8:23           ` [Intel-wired-lan] " Dziedziuch, SylwesterX
2021-09-14 21:40           ` PJ Waskiewicz
2021-09-14 21:40             ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-15  9:53             ` Dziedziuch, SylwesterX
2021-09-15  9:53               ` [Intel-wired-lan] " Dziedziuch, SylwesterX
2021-09-18  2:01               ` PJ Waskiewicz [this message]
2021-09-18  2:01                 ` PJ Waskiewicz
2021-09-20  7:48                 ` Dziedziuch, SylwesterX
2021-09-20  7:48                   ` [Intel-wired-lan] " Dziedziuch, SylwesterX
2021-09-21 17:06                   ` PJ Waskiewicz
2021-09-21 17:06                     ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-23 15:17                     ` PJ Waskiewicz
2021-09-23 15:17                       ` [Intel-wired-lan] " PJ Waskiewicz
2021-09-24  7:04                       ` Dziedziuch, SylwesterX
2021-09-24  7:04                         ` [Intel-wired-lan] " Dziedziuch, SylwesterX
  -- strict thread matches above, loose matches on Subject: below --
2021-08-25 19:23 PJ Waskiewicz
2021-08-26  8:03 ` Maciej Fijalkowski
2021-08-26 14:26   ` PJ Waskiewicz
