From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alan Stern
Subject: Re: [PATCH 0/3] Fix USB deadlock caused by SCSI error handling
Date: Mon, 7 Apr 2014 11:26:45 -0400 (EDT)
Message-ID:
References: <533BA835.7050508@suse.de>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Return-path:
Received: from netrider.rowland.org ([192.131.102.5]:35682 "HELO netrider.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1751946AbaDGP0q (ORCPT ); Mon, 7 Apr 2014 11:26:46 -0400
In-Reply-To: <533BA835.7050508@suse.de>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Hannes Reinecke
Cc: James Bottomley , SCSI development list , USB list

On Wed, 2 Apr 2014, Hannes Reinecke wrote:

> On 04/01/2014 11:28 PM, Alan Stern wrote:
> > On Tue, 1 Apr 2014, Hannes Reinecke wrote:
> >
> >>>> So if the above reasoning is okay then this patch should be doing
> >>>> the trick:
> >>>>
> >>>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> >>>> index 771c16b..0e72374 100644
> >>>> --- a/drivers/scsi/scsi_error.c
> >>>> +++ b/drivers/scsi/scsi_error.c
> >>>> @@ -189,6 +189,7 @@ scsi_abort_command(struct scsi_cmnd *scmd)
> >>>>                 /*
> >>>>                  * Retry after abort failed, escalate to next level.
> >>>>                  */
> >>>> +               scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
> >>>>                 SCSI_LOG_ERROR_RECOVERY(3,
> >>>>                         scmd_printk(KERN_INFO, scmd,
> >>>>                                     "scmd %p previous abort
> >>>> failed\n", scmd));
> >>>>
> >>>> (Beware of line
> >>>> breaks)
> >>>>
> >>>> Can you test with it?
> >>>
> >>> I don't understand.  This doesn't solve the fundamental problem (namely
> >>> that you escalate before aborting a running command).  All it does is
> >>> clear the SCSI_EH_ABORT_SCHEDULED flag before escalating.
> >>>
> >> Which was precisely the point :-)
> >>
> >> Hmm. The comment might've been clearer.
> >>
> >> What this patch is _supposed_ to be doing is that it'll clear the
> >> SCSI_EH_ABORT_SCHEDULED flag if it has been set.
> >> Which means this will be the second time scsi_abort_command() has
> >> been called for the same command.
> >> IE the first abort went out, did its thing, but now the same command
> >> has timed out again.
> >>
> >> So this flag gets cleared, and scsi_abort_command() returns FAILED,
> >> and _no_ asynchronous abort is being scheduled.
> >> scsi_times_out() will then proceed to call scsi_eh_scmd_add().
> >> But as we've cleared the SCSI_EH_ABORT_SCHEDULED flag
> >> the SCSI_EH_CANCEL_CMD flag will continue to be set,
> >> and the command will be aborted with the main SCSI EH routine.
> >>
> >> It looks to me as if it should do what you desire, namely abort the
> >> command asynchronously the first time, and invoking the SCSI EH the
> >> second time.
> >>
> >> Am I wrong?
> >
> > I don't know -- I'll have to try it out.  Currently I'm busy with a
> > bunch of other stuff, so it will take some time.

I finally got a chance to try it out.  It does seem to do what we want.
I didn't track the flow of control in complete detail, but the command
definitely got aborted both times it was issued.

Alan Stern
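
For anyone following the flag handling described above, here is a small
user-space model of it. This is only a sketch, not the kernel code: the
model_* names, the MODEL_* flag values, the return codes and the "patched"
switch are all hypothetical stand-ins, and the unpatched behaviour is
inferred from the discussion in this thread rather than checked against
the source.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel flags; the values are illustrative. */
#define MODEL_EH_CANCEL_CMD      0x0001
#define MODEL_EH_ABORT_SCHEDULED 0x0002

#define MODEL_SUCCESS 0
#define MODEL_FAILED  1

/* What a timeout ended up doing in this model. */
#define MODEL_ASYNC_ABORT 0   /* asynchronous abort was scheduled */
#define MODEL_FULL_EH     1   /* command handed to the main EH routine */

struct model_cmd {
	int eh_eflags;
};

/* Models scsi_abort_command() as described in the thread. */
int model_abort_command(struct model_cmd *cmd, int patched)
{
	if (cmd->eh_eflags & MODEL_EH_ABORT_SCHEDULED) {
		/* Second timeout: an abort already went out for this command. */
		if (patched)
			cmd->eh_eflags &= ~MODEL_EH_ABORT_SCHEDULED;
		return MODEL_FAILED;   /* no new asynchronous abort */
	}
	cmd->eh_eflags |= MODEL_EH_ABORT_SCHEDULED;
	return MODEL_SUCCESS;          /* asynchronous abort scheduled */
}

/*
 * Models scsi_eh_scmd_add(): assumed to drop the cancel request while
 * ABORT_SCHEDULED is still set, which is how the thread describes it.
 */
void model_eh_scmd_add(struct model_cmd *cmd, int eh_flag)
{
	if (cmd->eh_eflags & MODEL_EH_ABORT_SCHEDULED)
		eh_flag &= ~MODEL_EH_CANCEL_CMD;
	cmd->eh_eflags |= eh_flag;
}

/* Models the part of scsi_times_out() discussed here. */
int model_times_out(struct model_cmd *cmd, int patched)
{
	if (model_abort_command(cmd, patched) == MODEL_SUCCESS)
		return MODEL_ASYNC_ABORT;
	model_eh_scmd_add(cmd, MODEL_EH_CANCEL_CMD);
	return MODEL_FULL_EH;
}

/* Walk one command through two timeouts with the patch applied. */
void model_check_two_timeouts(void)
{
	struct model_cmd cmd = { 0 };

	assert(model_times_out(&cmd, 1) == MODEL_ASYNC_ABORT);
	assert(model_times_out(&cmd, 1) == MODEL_FULL_EH);
	/* The cancel flag survived, so the EH would abort the command. */
	assert(cmd.eh_eflags & MODEL_EH_CANCEL_CMD);
}
```

Calling model_times_out() twice with patched == 0 leaves
MODEL_EH_ABORT_SCHEDULED set and MODEL_EH_CANCEL_CMD clear in this model,
i.e. escalation without the command being cancelled. That is the case the
one-line patch is meant to avoid.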