dmaengine.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vinod Koul <vkoul@kernel.org>
To: Dave Jiang <dave.jiang@intel.com>
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>,
	dmaengine@vger.kernel.org
Subject: Re: [PATCH v3] dmaengine: idxd: fix submission race window
Date: Wed, 14 Jul 2021 12:52:55 +0530	[thread overview]
Message-ID: <YO6Qz/ErsM/g0npk@matsya> (raw)
In-Reply-To: <162498301955.2302125.5031103655704428294.stgit@djiang5-desk3.ch.intel.com>

On 29-06-21, 09:11, Dave Jiang wrote:
> Konstantin observed that when descriptors are submitted, the descriptor is
> added to the pending list after the submission. This creates a race window
> with the slight possibility that the descriptor can complete before it
> gets added to the pending list and this window would cause the completion
> handler to miss processing the descriptor.
> 
> To address the issue, the addition of the descriptor to the pending list
> must be done before it gets submitted to the hardware. However, submitting
> to swq with ENQCMDS instruction can cause a failure with the condition of
> either wq is full or wq is not "active".
> 
> With the descriptor allocation being the gate to the wq capacity, it is not
> possible to hit a retry with ENQCMDS submission to the swq. The only
> possible failure can happen is when wq is no longer "active" due to hw
> error and therefore we are moving towards taking down the portal. Given
> this is a rare condition and there's no longer concern over I/O
> performance, the driver can walk the completion lists in order to retrieve
> and abort the descriptor.
> 
> The error path will set the descriptor to aborted status. It will take the
> work list lock to prevent further processing of worklist. It will do a
> delete_all on the pending llist to retrieve all descriptors on the pending
> llist. The delete_all action does not require a lock. It will walk through
> the acquired llist to find the aborted descriptor while add all remaining
> descriptors to the work list since it holds the lock. If it does not find
> the aborted descriptor on the llist, it will walk through the work
> list. And if it still does not find the descriptor, then it means the
> interrupt handler has removed the desc from the llist but is pending on
> the work list lock and will process it once the error path releases the
> lock.

This fails to apply on fixes branch, pls rebase

-- 
~Vinod

      parent reply	other threads:[~2021-07-14  7:23 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-29 16:11 [PATCH v3] dmaengine: idxd: fix submission race window Dave Jiang
2021-06-30 17:33 ` Ananyev, Konstantin
2021-07-14  7:22 ` Vinod Koul [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YO6Qz/ErsM/g0npk@matsya \
    --to=vkoul@kernel.org \
    --cc=dave.jiang@intel.com \
    --cc=dmaengine@vger.kernel.org \
    --cc=konstantin.ananyev@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).