Hi Bart, On 2020-08-01 00:51, Bart Van Assche wrote: > On 2020-07-31 01:00, Can Guo wrote: >> AFAIK, sychronization of scsi_done is not a problem here, because scsi >> layer >> use the atomic state, namely SCMD_STATE_COMPLETE, of a scsi cmd to >> prevent >> the concurrency of abort and real completion of it. >> >> Check func scsi_times_out(), hope it helps. >> >> enum blk_eh_timer_return scsi_times_out(struct request *req) >> { >> ... >>         if (rtn == BLK_EH_DONE) { >>                 /* >>                  * Set the command to complete first in order to >> prevent >> a real >>                  * completion from releasing the command while error >> handling >>                  * is using it. If the command was already completed, >> then the >>                  * lower level driver beat the timeout handler, and it >> is safe >>                  * to return without escalating error recovery. >>                  * >>                  * If timeout handling lost the race to a real >> completion, the >>                  * block layer may ignore that due to a fake timeout >> injection, >>                  * so return RESET_TIMER to allow error handling >> another >> shot >>                  * at this command. >>                  */ >>                 if (test_and_set_bit(SCMD_STATE_COMPLETE, >> &scmd->state)) >>                         return BLK_EH_RESET_TIMER; >>                 if (scsi_abort_command(scmd) != SUCCESS) { >>                         set_host_byte(scmd, DID_TIME_OUT); >>                         scsi_eh_scmd_add(scmd); >>                 } >>         } >> } > > I am familiar with this mechanism. My concern is that both the regular > completion path and the abort handler must call scsi_dma_unmap() before > calling cmd->scsi_done(cmd). I don't see how > test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state) could prevent that > the regular completion path and the abort handler call scsi_dma_unmap() > concurrently since both calls happen before the SCMD_STATE_COMPLETE bit > is set? > > Thanks, > > Bart. For scsi_dma_unmap() part, that is true - we should make it serialized with any other completion paths. I've found it during my fault injection test, so I've made a patch to fix it, but it only comes in my next error recovery enhancement patch series. Please check the attachment. Thanks, Can Guo.