linux-wireless.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brian Norris <briannorris@chromium.org>
To: Ganapathi Bhat <gbhat@marvell.com>,
	Nishant Sarmukadam <nishants@marvell.com>
Cc: linux-kernel@vger.kernel.org,
	Dmitry Torokhov <dmitry.torokhov@gmail.com>,
	Amitkumar Karwar <amitkarwar@gmail.com>,
	Kalle Valo <kvalo@codeaurora.org>,
	linux-wireless@vger.kernel.org
Subject: Re: [PATCH 01/14] mwifiex: pcie: properly synchronize, disable interrupts in driver callbacks
Date: Wed, 31 May 2017 10:10:59 -0700	[thread overview]
Message-ID: <20170531171058.GA54739@google.com> (raw)
In-Reply-To: <20170525001119.64791-1-briannorris@chromium.org>

By the way, this had a few review comments elsewhere, which I'll
summarize here, since I plan to resubmit a new version sometime.

On Wed, May 24, 2017 at 05:11:06PM -0700, Brian Norris wrote:
> It seems that the implicit assumption of the mwifiex
> {enable,disable}_int() callbacks is that after ->disable_int(), all
> interrupt handling should be complete (synchronized) and not fire again
> until after ->enable_int(). Also, interrupts should not be serviced
> until after the first ->enable_int().
> 
> However, the PCIe driver does none of this. First, the existing
> interrupt mask programming appears to only have an effect for legacy
> interrupts. It doesn't actually prevent MSI/MSI-X interrupts. Second,
> even when it might mask interrupts, we're doing nothing to ensure that
> pending IRQs have finished processing; they could be already in-flight
> when a CPU masks them.
> 
> Another quirk of this driver's design is the use of a racy
> "surprise_removed" check in mwifiex_pcie_interrupt(). This appears to
> act like a racy poor-man's version of masking our interrupts -- it
> allows us to short-circuit the ISR if it fires when we're not prepared
> to handle more work.
> 
> We can resolve this all by:
> (a) disabling our IRQs after requesting them
> (b) call {enable,disable}_irq() in the {enable,disable}_int() callbacks
> (c) remove the racy '->surprise_removed' hack from
>     mwifiex_pcie_interrupt()
> (d) document the effect (or lack thereof) of PCIE_HOST_INT_MASK, to
>     clarify and possibly prevent future misuse
> 
> Along the way, I decided to use underscores to prefix the driver-private
> forms of "disabling interrupts" (instead of the awkward "_noerr" suffix
> used already), partly to discourage their use.
> 
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
>  drivers/net/wireless/marvell/mwifiex/pcie.c | 70 ++++++++++++++++++++---------
>  1 file changed, 49 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index 394224d6c219..ea75315bf19d 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -505,12 +505,10 @@ static int mwifiex_pm_wakeup_card_complete(struct mwifiex_adapter *adapter)
>  }
>  
>  /*
> - * This function disables the host interrupt.
> - *
> - * The host interrupt mask is read, the disable bit is reset and
> - * written back to the card host interrupt mask register.
> + * This function masks the host interrupt. Effective only for legacy PCI
> + * interrupts.
>   */
> -static int mwifiex_pcie_disable_host_int(struct mwifiex_adapter *adapter)
> +static int __mwifiex_pcie_disable_host_int(struct mwifiex_adapter *adapter)
>  {
>  	if (mwifiex_pcie_ok_to_access_hw(adapter)) {
>  		if (mwifiex_write_reg(adapter, PCIE_HOST_INT_MASK,
> @@ -525,18 +523,30 @@ static int mwifiex_pcie_disable_host_int(struct mwifiex_adapter *adapter)
>  	return 0;
>  }
>  
> -static void mwifiex_pcie_disable_host_int_noerr(struct mwifiex_adapter *adapter)
> +/*
> + * Disable interrupts, synchronizing with any outstanding interrupts.
> + */
> +static void mwifiex_pcie_disable_host_int(struct mwifiex_adapter *adapter)
>  {
> -	WARN_ON(mwifiex_pcie_disable_host_int(adapter));
> +	struct pcie_service_card *card = adapter->card;
> +	int i;
> +
> +	WARN_ON(__mwifiex_pcie_disable_host_int(adapter));
> +
> +	if (card->msix_enable) {
> +		for (i = 0; i < MWIFIEX_NUM_MSIX_VECTORS; i++) {
> +			disable_irq(card->msix_entries[i].vector);
> +		}
> +	} else {
> +		disable_irq(card->dev->irq);

This approach is not safe for the non-MSI-X case, since we actually
requested this IRQ with IRQF_SHARED. That's likely mostly for the legacy
PCI interrupt case (where we *have* to support shared interrupts) and
could probably be modified, but at any rate, this is unsafe as written.

Also, I've fielded objections to using the host-level IRQ masking for
disabling MSI interrupts here. I'm still not completely sure *why* the
objection, but I'm investigating whether there's any device-level
mechanism for disabling MSI interrupts on the Wif card. (Marvell folks,
feel free to speak up here.)

> +	}
>  }
>  
>  /*
> - * This function enables the host interrupt.
> - *
> - * The host interrupt enable mask is written to the card
> - * host interrupt mask register.
> + * This function unmasks the host interrupt. Effective only for legacy PCI
> + * interrupts.
>   */
> -static int mwifiex_pcie_enable_host_int(struct mwifiex_adapter *adapter)
> +static int __mwifiex_pcie_enable_host_int(struct mwifiex_adapter *adapter)
>  {
>  	if (mwifiex_pcie_ok_to_access_hw(adapter)) {
>  		/* Simply write the mask to the register */
> @@ -551,6 +561,26 @@ static int mwifiex_pcie_enable_host_int(struct mwifiex_adapter *adapter)
>  	return 0;
>  }
>  
> +static int mwifiex_pcie_enable_host_int(struct mwifiex_adapter *adapter)
> +{
> +	struct pcie_service_card *card = adapter->card;
> +	int i, ret;
> +
> +	ret = __mwifiex_pcie_enable_host_int(adapter);
> +	if (ret)
> +		return ret;
> +
> +	if (card->msix_enable) {
> +		for (i = 0; i < MWIFIEX_NUM_MSIX_VECTORS; i++) {
> +			enable_irq(card->msix_entries[i].vector);
> +		}
> +	} else {
> +		enable_irq(card->dev->irq);
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * This function initializes TX buffer ring descriptors
>   */
> @@ -1738,7 +1768,7 @@ static int mwifiex_pcie_process_cmd_complete(struct mwifiex_adapter *adapter)
>  			while (reg->sleep_cookie && (count++ < 10) &&
>  			       mwifiex_pcie_ok_to_access_hw(adapter))
>  				usleep_range(50, 60);
> -			mwifiex_pcie_enable_host_int(adapter);
> +			__mwifiex_pcie_enable_host_int(adapter);
>  			mwifiex_process_sleep_confirm_resp(adapter, skb->data,
>  							   skb->len);
>  		} else {
> @@ -2081,7 +2111,7 @@ static int mwifiex_prog_fw_w_helper(struct mwifiex_adapter *adapter,
>  		    "info: Downloading FW image (%d bytes)\n",
>  		    firmware_len);
>  
> -	if (mwifiex_pcie_disable_host_int(adapter)) {
> +	if (__mwifiex_pcie_disable_host_int(adapter)) {
>  		mwifiex_dbg(adapter, ERROR,
>  			    "%s: Disabling interrupts failed.\n", __func__);
>  		return -1;
> @@ -2335,8 +2365,7 @@ static void mwifiex_interrupt_status(struct mwifiex_adapter *adapter,
>  		if ((pcie_ireg == 0xFFFFFFFF) || !pcie_ireg)
>  			return;
>  
> -
> -		mwifiex_pcie_disable_host_int(adapter);
> +		__mwifiex_pcie_disable_host_int(adapter);
>  
>  		/* Clear the pending interrupts */
>  		if (mwifiex_write_reg(adapter, PCIE_HOST_INT_STATUS,
> @@ -2387,9 +2416,6 @@ static irqreturn_t mwifiex_pcie_interrupt(int irq, void *context)
>  	}
>  	adapter = card->adapter;
>  
> -	if (adapter->surprise_removed)
> -		goto exit;
> -
>  	if (card->msix_enable)
>  		mwifiex_interrupt_status(adapter, ctx->msg_id);
>  	else
> @@ -2494,7 +2520,7 @@ static int mwifiex_process_pcie_int(struct mwifiex_adapter *adapter)
>  		    "info: cmd_sent=%d data_sent=%d\n",
>  		    adapter->cmd_sent, adapter->data_sent);
>  	if (!card->msi_enable && adapter->ps_state != PS_STATE_SLEEP)
> -		mwifiex_pcie_enable_host_int(adapter);
> +		__mwifiex_pcie_enable_host_int(adapter);
>  
>  	return 0;
>  }
> @@ -3055,6 +3081,7 @@ static int mwifiex_pcie_request_irq(struct mwifiex_adapter *adapter)
>  						  &card->msix_ctx[i]);
>  				if (ret)
>  					break;
> +				disable_irq(card->msix_entries[i].vector);

Also, if we're really dealing with spurious interrupts at init time,
then this leaves a window of time in between request_irq() and this
disable_irq() in which we could still receive a bad IRQ. So this should
be reworked to do one of:
(a) move the request_irq() later, until we're really able to handle
interrupts
(b) set the IRQ_NOAUTOEN flag (for the non-shared case), to avoid
enabling IRQs initially
(c) use some sort of (yet-unknown) device-level mask for MSI interrupts.

I'm looking to address these problems in a v2. Many of the other patches
are likely independent. I'll plan to resubmit them in the next series
(if they aren't applied before then), to avoid conflicts with those that
aren't independent, and because I intentionally put bugfixes (like this
patch) first in the series.

Brian

>  			}
>  
>  			if (ret) {
> @@ -3087,6 +3114,7 @@ static int mwifiex_pcie_request_irq(struct mwifiex_adapter *adapter)
>  		pr_err("request_irq failed: ret=%d\n", ret);
>  		return -1;
>  	}
> +	disable_irq(pdev->irq);
>  
>  	return 0;
>  }
> @@ -3244,7 +3272,7 @@ static struct mwifiex_if_ops pcie_ops = {
>  	.register_dev =			mwifiex_register_dev,
>  	.unregister_dev =		mwifiex_unregister_dev,
>  	.enable_int =			mwifiex_pcie_enable_host_int,
> -	.disable_int =			mwifiex_pcie_disable_host_int_noerr,
> +	.disable_int =			mwifiex_pcie_disable_host_int,
>  	.process_int_status =		mwifiex_process_int_status,
>  	.host_to_card =			mwifiex_pcie_host_to_card,
>  	.wakeup =			mwifiex_pm_wakeup_card,
> -- 
> 2.13.0.219.gdb65acc882-goog
> 

  parent reply	other threads:[~2017-05-31 17:11 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-25  0:11 [PATCH 01/14] mwifiex: pcie: properly synchronize, disable interrupts in driver callbacks Brian Norris
2017-05-25  0:11 ` [PATCH 02/14] mwifiex: reunite copy-and-pasted remove/reset code Brian Norris
2017-05-25  0:11 ` [PATCH 03/14] mwifiex: reset interrupt status across device reset Brian Norris
2017-05-25  0:11 ` [PATCH 04/14] mwifiex: pcie: don't allow cmd buffer reuse after reset Brian Norris
2017-05-25  0:11 ` [PATCH 05/14] mwifiex: re-register wiphy across reset Brian Norris
2017-06-01  9:15   ` Kalle Valo
2017-06-01 17:39     ` Brian Norris
2017-06-05 15:54       ` Kalle Valo
2017-06-09  9:03         ` Johannes Berg
2017-06-21 18:27           ` Brian Norris
2017-06-22 13:02             ` Johannes Berg
2017-06-27 20:48               ` Brian Norris
2017-06-28  7:28                 ` Johannes Berg
2017-06-29 18:45                   ` Brian Norris
2017-07-04 14:10                 ` Kalle Valo
2017-06-21 17:48         ` Brian Norris
2017-06-22 12:59           ` Johannes Berg
2017-06-27 19:50             ` Brian Norris
2017-06-28  7:21               ` Johannes Berg
2017-05-25  0:11 ` [PATCH 06/14] mwifiex: don't short-circuit netdev notifiers on interface deletion Brian Norris
2017-05-25  0:11 ` [PATCH 07/14] mwifiex: fixup init_channel_scan_gap error case Brian Norris
2017-05-25  0:11 ` [PATCH 08/14] mwifiex: ensure "disable auto DS" struct is initialized Brian Norris
2017-05-25  0:11 ` [PATCH 09/14] mwifiex: pcie: remove redundant synchronize_irq() Brian Norris
2017-05-25  0:11 ` [PATCH 10/14] mwifiex: pcie: stop masking interrupts in FW downloader Brian Norris
2017-05-25  0:11 ` [PATCH 11/14] mwifiex: utilize netif_tx_{wake,stop}_all_queues() Brian Norris
2017-05-25  0:11 ` [PATCH 12/14] mwifiex: don't open-code ARRAY_SIZE() Brian Norris
2017-05-25  0:11 ` [PATCH 13/14] mwifiex: drop 'add_tail' param from mwifiex_insert_cmd_to_pending_q() Brian Norris
2017-05-25  0:11 ` [PATCH 14/14] mwifiex: pcie: fix whitespace Brian Norris
2017-05-31 17:10 ` Brian Norris [this message]
2017-06-05 11:49   ` [PATCH 01/14] mwifiex: pcie: properly synchronize, disable interrupts in driver callbacks Xinming Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170531171058.GA54739@google.com \
    --to=briannorris@chromium.org \
    --cc=amitkarwar@gmail.com \
    --cc=dmitry.torokhov@gmail.com \
    --cc=gbhat@marvell.com \
    --cc=kvalo@codeaurora.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-wireless@vger.kernel.org \
    --cc=nishants@marvell.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).