From: Konstantin Shelekhin <k.shelekhin@yadro.com>
To: Mike Christie <michael.christie@oracle.com>
Cc: <target-devel@vger.kernel.org>, <linux@yadro.com>,
	Maurizio Lombardi <mlombard@redhat.com>
Subject: Re: iSCSI Abort Task and WRITE PENDING
Date: Tue, 19 Oct 2021 00:50:22 +0300
Message-ID: <YW3sHlx6P+B6Jqjy@yadro.com>
In-Reply-To: <1858f7c3-4874-6931-da3a-12518aa36719@oracle.com>

On Mon, Oct 18, 2021 at 03:34:44PM -0500, Mike Christie wrote:
> On 10/18/21 3:20 PM, Mike Christie wrote:
> > On 10/18/21 12:32 PM, Konstantin Shelekhin wrote:
> >> On Mon, Oct 18, 2021 at 11:29:23AM -0500, Mike Christie wrote:
> >>> On 10/18/21 6:56 AM, Konstantin Shelekhin wrote:
> >>>> On Thu, Oct 14, 2021 at 10:18:13PM -0500, michael.christie@oracle.com wrote:
> >>>>>> If I understand this approach correctly, it fixes the deadlock, but the
> >>>>>> connection reinstatement will still happen, because WRITE_10 won't be
> >>>>>> aborted and the connection will go down after the timeout.
> >>>>>> IMO it's not ideal either, since now iSCSI will have a 50% chance to
> >>>>>> have the connection (meaning SCSI session) killed on arbitrary ABOR
> >>>>>
> >>>>> I wouldn't call this an arbitrary abort. It's indicating a problem.
> >>>>> When do you see this? Why do we need to fix it per cmd? Are you hitting
> >>>>> the big command short timeout issue? Driver/fw bug?
> >>>>
> >>>> It was triggered by ESXi. During some heavy IOPS intervals the backend
> >>>> device cannot handle the load and some IOs get stuck for more than 30
> >>>> seconds. I suspect that ABORT TASKs are issued by the virtual machines.
> >>>> So a series of ABORT TASK will come, and the unlucky one will hit the
> >>>> issue.
> >>>
> >>> I didn't get this. If only the backend is backed up then we should
> >>> still be transmitting the data out/R2Ts quickly and we shouldn't be
> >>> hitting the issue where we got stuck waiting on them.
> >>
> >> We get stuck waiting on them because the initiator will not send Data-Out
> > 
> > We are talking about different things here. Above I'm just asking about what
> > leads to the cmd timeout.
> 
> Oh wait, I misunderstood the "almost immediately" part in your #3.
> 
> Just tell me if you are running iscsi in the guest or hypervisor and if
> the latter what version of ESXi,

ESXi 6.7 is connected over iSCSI. It uses the block device for
datastore.
 
> > 
> > You wrote before the abort is sent the backend gets backed up, and the back
> > up causes IO to take long enough for the initiator cmd timeout to fire.
> > I'm asking why, before the initiator side cmd timeout fires and before the
> > abort is sent, R2T/data_outs aren't executing quickly if only the backend is backed up?
> > 
> > Is it the bug I mentioned where one of the iscsi threads is stuck on the
> > submission to the block layer, so that thread can't handle iscsi IO?
> > If so I have a patch for that.
> > 
> > I get that once the abort is sent we hit these other issues.
> > 
> > 
> >> PDUs after sending ABORT TASK:
> >>
> >>   1. Initiator sends WRITE CDB
> >>   2. Target sends R2T
> >>   3. Almost immediately Initiator decides to abort the request and sends
> > 
> > Are you using iscsi in the VM or in the hypervisor? For the latter is
> > timeout 15 seconds for normal READs/WRITEs? What version of ESXi?
> > 
> >>      ABORT TASK without sending any further Data-Out PDUs (maybe except for
> >>      the first one); I believe it happens because the initiator tries to
> >>      abort a larger batch of requests, and this unlucky request is just
> >>      the last in series
> >>   4. Target still waits for Data-Out PDUs and times out on Data-Out timer
> >
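The sequence quoted above (steps 1 to 4) can be sketched as a toy state
machine. This is only an illustration under assumptions: the class name,
method names, state strings and the 3.0 second timeout are all hypothetical
and are not taken from the target code or from any real default. It models
nothing more than "ABORT TASK arrives while the command is waiting for
Data-Out, and only the Data-Out timer ends the wait".

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PendingWrite:
    """Toy model of one WRITE on the target side (hypothetical, not kernel code)."""
    dataout_timeout: float  # seconds; illustrative value, not a real default
    state: str = "NEW"
    last_activity: float = 0.0
    abort_requested: bool = False
    log: List[Tuple[float, str]] = field(default_factory=list)

    def on_write_cdb(self, now: float) -> None:
        # Steps 1-2: WRITE CDB arrives, target answers with R2T and starts waiting.
        self.state = "WRITE_PENDING"
        self.last_activity = now
        self.log.append((now, "R2T sent, Data-Out timer armed"))

    def on_abort_task(self, now: float) -> bool:
        # Step 3: ABORT TASK arrives while Data-Out is still outstanding.
        # In this model the abort is only recorded, because the command
        # cannot leave WRITE_PENDING on its own. Returns True only if the
        # abort could complete immediately.
        if self.state == "WRITE_PENDING":
            self.abort_requested = True
            self.log.append((now, "ABORT TASK received, still WRITE_PENDING"))
            return False
        return True

    def tick(self, now: float) -> None:
        # Step 4: no further Data-Out arrives, so only the timer ends the wait.
        waited = now - self.last_activity
        if self.state == "WRITE_PENDING" and waited >= self.dataout_timeout:
            self.state = "DATAOUT_TIMEOUT"
            self.log.append((now, "Data-Out timer expired"))


cmd = PendingWrite(dataout_timeout=3.0)
cmd.on_write_cdb(now=0.0)
aborted = cmd.on_abort_task(now=0.1)   # initiator aborts almost immediately
cmd.tick(now=1.0)
state_before = cmd.state               # still waiting for Data-Out
cmd.tick(now=3.0)
state_after = cmd.state                # timer fired, no Data-Out ever came
```

In this model the abort at t=0.1 does not complete, and the command stays
pending until the timer fires at t=3.0, which corresponds to the point where
the connection goes down in the scenario discussed earlier in the thread.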

