All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG: scsi/qla2xxx: scsi_done from qla2x00_sp_compl race with scsi_queue_insert from abort handler
@ 2017-11-08  6:29 jianchao.wang
  0 siblings, 0 replies; only message in thread
From: jianchao.wang @ 2017-11-08  6:29 UTC (permalink / raw)
  To: qla2xxx-upstream; +Cc: jejb, martin.petersen, linux-scsi, linux-kernel

Hi 

[1.] One line summary of the problem:
scsi_done from qla2x00_sp_compl race with scsi_queue_insert from scmd_eh_abort_handler()
then cause the BUG_ON(blk_queued_rq(req)) trigger.

[2.] Full description of the problem/report:
The detailed scene is as following:
cpu A                                                 cpu B
kworker                                               interrupt
                                                     
scmd_eh_abort_handler()                               
  -> scsi_try_to_abort_cmd()                   qla2x00_status_entry()   
    -> qla2xxx_eh_abort()                        -> res = DID_RESET << 16 //case CS_ABORTED   
      -> qla2x00_eh_wait_on_command()            -> qla2x00_sp_compl()                      
        >>>>                                       +-> cmd->result = res; // --->P0_B
          while (CMD_SP(cmd) && wait_iter--) {     +-> qla2x00_sp_free_dma()             
            msleep(ABORT_POLLING_PERIOD);          |    -> CMD_SP(cmd) = NULL                             
          }                                        |             
        >>>>                                       |
  -> scsi_noretry_cmd() //--->P2_A                 |
  -> scsi_queue_insert()                           +-> scsi_done()
    -> __scsi_queue_insert()                         -> blk_complete_request()
      -> cmd->result = 0; // --->P0_A                  |
        -> blk_requeue_request()                       |
          -> blk_clear_rq_complete() // --->P1_A       |
            -> elv_requeue_request()                   |
              -> __elv_add_request()                   +-> blk_mark_rq_complete() //  ----> P1_B
                // req will be queued here             +-> __blk_complete_request()
                                                    BLK_SOFTIRQ
                                                    scsi_softirq_done()
                                                      -> scsi_decide_disposition() // ----> P2_B
                                                      // scsi_cmnd->result is set to 0 in P0_A
                                                      // so it will return SUCCESS
                                                        -> scsi_finish_command()
                                                          -> scsi_io_completion()
                                                            -> scsi_end_request()
                                                              -> blk_finish_request()  // BUG_ON(blk_queued_rq(req)) !!!

The scsi_cmd->result was set to DID_RESET << 16 at P0_B.
At P2_A, the scsi_noretry_cmd() would return 0 and scsi_queue_insert() would be invoked.
When the blk_clear_rq_complete() at P1_A was invoked before the blk_mark_rq_complete()
at P1_B, __blk_complete_request() would be invoked in cpu B.
At P2_B, the scsi_decide_disposition() will return SUCCESS as the scsi_cmd->result had been set to 0 at P0_A.
Finally, the BUG_ON(blk_queued_rq(req)) in blk_finish_request() was triggered.

Whether the scsi_done should be invoked in qla2xxx_sp_compl() when scsi_cmd is aborted ?
When the P1_B is before P1_A, it will do nothing, otherwise, the panic comes up.

Thanks
Jianchao

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2017-11-08  6:29 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-08  6:29 BUG: scsi/qla2xxx: scsi_done from qla2x00_sp_compl race with scsi_queue_insert from abort handler jianchao.wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.