All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shannon Nelson <shannon.nelson@oracle.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [next PATCH S75-V2 04/12] i40e: synchronize nvmupdate command and adminq subtask
Date: Wed, 12 Jul 2017 08:28:23 -0700	[thread overview]
Message-ID: <be57f6f0-4007-a264-54ab-da9e68e06926@oracle.com> (raw)
In-Reply-To: <20170711120118.44095-4-alice.michael@intel.com>



On 7/11/2017 5:01 AM, Alice Michael wrote:
> From: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
> 
> During NVM update, state machine gets into unrecoverable state because
> i40e_clean_adminq_subtask can get scheduled after the admin queue
> command but before other state variables are updated. This causes
> incorrect input to i40e_nvmupd_check_wait_event and state transitions
> don't happen.
> 
> This issue existed before but surfaced after commit 373149fc99a0
> ("i40e: Decrease the scope of rtnl lock")

I had a feeling that patch might bite you.  I suspect there may still be 
some other occasional timing issues cropping up.

> 
> This fix adds locking around admin queue command and update of
> state variables so that adminq_subtask will have accurate information
> whenever it gets scheduled.
> 
> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
> ---
>   drivers/net/ethernet/intel/i40e/i40e_nvm.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
> index 17607a2..04f2192 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
> @@ -753,6 +753,11 @@ i40e_status i40e_nvmupd_command(struct i40e_hw *hw,
>   		hw->nvmupd_state = I40E_NVMUPD_STATE_INIT;
>   	}
>   
> +	/* Acquire lock to prevent race condition where adminq_task
> +	 * can execute after i40e_nvmupd_nvm_read/write but before state
> +	 * variables (nvm_wait_opcode, nvm_release_on_done) are updated
> +	 */
> +	mutex_lock(&hw->aq.arq_mutex);

Have you done any testing to see how long you might end up holding this 
lock?  I suppose it is limited by the max length of the synchronous AQ 
polling timeout.  You might mention that maximum time limitation here or 
in the commit notes, since this is a mutex over a possibly long I/O 
operation.

>   	switch (hw->nvmupd_state) {
>   	case I40E_NVMUPD_STATE_INIT:
>   		status = i40e_nvmupd_state_init(hw, cmd, bytes, perrno);
> @@ -788,6 +793,7 @@ i40e_status i40e_nvmupd_command(struct i40e_hw *hw,


There's a return statement in the *_WAIT cases that should have a 
mutex_unlock(), or should have a goto to the unlock at the end of the 
function, or you'll end up never again receiving AR events.

sln

>   		*perrno = -ESRCH;
>   		break;
>   	}
> +	mutex_unlock(&hw->aq.arq_mutex);
>   	return status;
>   }
>   
> 

  reply	other threads:[~2017-07-12 15:28 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-07-11 12:01 [Intel-wired-lan] [next PATCH S75-V2 01/12] i40evf: use netdev variable in reset task Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 02/12] i40e: use cpumask_copy instead of direct assignment Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 03/12] i40e: prevent changing ITR if adaptive-rx/tx enabled Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 04/12] i40e: synchronize nvmupdate command and adminq subtask Alice Michael
2017-07-12 15:28   ` Shannon Nelson [this message]
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 05/12] i40e: Store the requested FEC information Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 07/12] i40e: Use correct flag to enable egress traffic for unicast promisc Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 08/12] i40evf: fix possible snprintf truncation of q_vector->name Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 09/12] i40e: force VMDQ device name truncation Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 10/12] i40e/i40evf: support for VF VLAN tag stripping control Alice Michael
2017-07-11 12:01 ` [Intel-wired-lan] [next PATCH S75-V2 11/12] i40e: 25G FEC status improvements Alice Michael
  -- strict thread matches above, loose matches on Subject: below --
2017-07-13  0:37 [Intel-wired-lan] [next PATCH S75-V2 04/12] i40e: synchronize nvmupdate command and adminq subtask Mogilappagari, Sudheer
2017-06-29  8:36 [Intel-wired-lan] [next PATCH S75-V2 01/12] i40evf: use netdev variable in reset task Alice Michael
2017-06-29  8:36 ` [Intel-wired-lan] [next PATCH S75-V2 04/12] i40e: synchronize nvmupdate command and adminq subtask Alice Michael

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=be57f6f0-4007-a264-54ab-da9e68e06926@oracle.com \
    --to=shannon.nelson@oracle.com \
    --cc=intel-wired-lan@osuosl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.