Hi James,

In addition, I think the if statement in XenVbd_HwScsiResetBus might need to
use suspend_resume_state_fdo, not suspend_resume_state_pdo.
suspend_resume_state_pdo has already changed to SR_STATE_SUSPENDING while
there are still I/O requests that have not finished; if the check used the fdo
state, those requests could be completed when the reset happens.

What do you think?

Thanks.

static BOOLEAN
XenVbd_HwScsiResetBus(PVOID DeviceExtension, ULONG PathId)
{
  PXENVBD_DEVICE_DATA xvdd = DeviceExtension;
  srb_list_entry_t *srb_entry;
  PSCSI_REQUEST_BLOCK srb;
  int i;

  UNREFERENCED_PARAMETER(DeviceExtension);
  UNREFERENCED_PARAMETER(PathId);

  FUNCTION_ENTER();

  KdPrint((__DRIVER_NAME " IRQL = %d\n", KeGetCurrentIrql()));
  if (xvdd->ring_detect_state == RING_DETECT_STATE_COMPLETE
      && xvdd->device_state->suspend_resume_state_pdo == SR_STATE_RUNNING) /* <-- this line */
  {
    while((srb_entry = (srb_list_entry_t *)RemoveHeadList(&xvdd->srb_list)) != (srb_list_entry_t *)&xvdd->srb_list)
    {
      srb = srb_entry->srb;
      srb->SrbStatus = SRB_STATUS_BUS_RESET;
      KdPrint((__DRIVER_NAME " completing queued SRB %p with status SRB_STATUS_BUS_RESET\n", srb));
      ScsiPortNotification(RequestComplete, xvdd, srb);
    }

>> Subject: RE: PV resume failed after self migration failed
>> Date: Wed, 22 Jun 2011 14:06:18 +1000
>> From: james.harper@bendigoit.com.au
>> To: tinnycloud@hotmail.com; xen-devel@lists.xensource.com
>>
>> > >
>> > > The xenvbd driver doesn't do any timeout, windows does the timeout and
>> > > tells xenvbd to reset. I haven't tested the scenario you describe very
>> > > recently, and xenvbd is now two different drivers, one for scsiport
>> > > (<= 2003) and one for storport (>= Vista), so there could be bugs in
>> > > either.
>> > >
>> >
>> > The bug can be reproduced on a 2003 32-bit system. We are using the scsi
>> > driver. I put some logging in XenVbd_HwScsiResetBus to see if there are
>> > uncompleted srbs (like below), but I didn't see the log when
>> > XenVbd_HwScsiResetBus was called. So no IO is in the queue.
>>
>> Just to confirm, is this the issue that only happens when the migration
>> fails in xen and is cancelled?
>>
>>Exactly.
>>I've noticed some differences in the log.
>
>In a normal resume, the log shows event ports assigned like below:
>pdo_event_channel = 5 (Notifying event channel 5)
>suspend event channel = 6
>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 7 (for VBD)
>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 8 (VIF)
>
>>When the guest resumes locally from suspend (that is, the migration failed in
>>xen but the guest has already suspended, so it needs resuming):
>
>>pdo_event_channel = 7 (Notifying event channel 7)
>>suspend event channel = 8
>>XEN_INIT_TYPE_EVENT_CHANNEL - event-channel = 9 (VIF)
>
>>The VBD port is not allocated, since the pdo is waiting for the fdo state to change.
>
>>It looks like ports 5 and 6 are still occupied, or is pdo_event_channel bound twice?
>
>It works when I unbind pdo_event_channel & suspend_evtchn.
>===================================================================
>--- xenpci_fdo.c (revision 4304)
>+++ xenpci_fdo.c (working copy)
>@@ -656,6 +656,12 @@
>   }
>   WdfChildListEndIteration(child_list, &child_iterator);
>
>+  EvtChn_Unbind(xpdd, xpdd->pdo_event_channel);
>+  EvtChn_Close(xpdd, xpdd->pdo_event_channel);
>+
>+  EvtChn_Unbind(xpdd, xpdd->suspend_evtchn);
>+  EvtChn_Close(xpdd, xpdd->suspend_evtchn);
>+
>   XenBus_Suspend(xpdd);
>   EvtChn_Suspend(xpdd);
>   XenPci_HighSync(XenPci_Suspend0, XenPci_SuspendN, xpdd);
>
>
>BTW, is there a missing "break" in XenVbd_HwScsiInterrupt, xenvbd_scsiport.c:928,
>before default? Well, it is harmless.
>
>924 case SR_STATE_RUNNING:
>925   KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
>926   xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
>927   xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
>928   ScsiPortNotification(NextRequest, DeviceExtension);
>929 default:
>930   KdPrint((__DRIVER_NAME " New pdo state %d\n", suspend_resume_state_pdo));
>931   xvdd->device_state->suspend_resume_state_fdo = suspend_resume_state_pdo;
>932   xvdd->vectors.EvtChn_Notify(xvdd->vectors.context, xvdd->device_state->pdo_event_channel);
>933   break;
>
>Thanks.
>>> James
>>>
>
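
For reference, the effect of the missing "break" asked about above can be checked outside the driver with a small standalone sketch. Everything in it is a hypothetical stand-in: the enum values, the counters struct and the two handler functions are not the real xenvbd definitions, and the counters only model how often EvtChn_Notify and ScsiPortNotification(NextRequest, ...) would be reached. It shows that falling through from the SR_STATE_RUNNING case repeats the state copy and the notification; whether the second notification matters in the real driver is not something this sketch can tell.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's state values (not the real ones). */
enum sr_state { SR_STATE_RUNNING, SR_STATE_SUSPENDING, SR_STATE_RESUMING };

/* Counts how often each modelled call would be reached. */
struct counters {
    enum sr_state fdo_state;   /* models suspend_resume_state_fdo */
    int notify_calls;          /* models EvtChn_Notify(...) */
    int next_request_calls;    /* models ScsiPortNotification(NextRequest, ...) */
};

/* Same shape as the quoted switch: no break at the end of the RUNNING case. */
static void handle_without_break(struct counters *c, enum sr_state pdo_state)
{
    switch (pdo_state) {
    case SR_STATE_RUNNING:
        c->fdo_state = pdo_state;
        c->notify_calls++;
        c->next_request_calls++;
        /* falls through */
    default:
        c->fdo_state = pdo_state;
        c->notify_calls++;
        break;
    }
}

/* Variant with the break added before default. */
static void handle_with_break(struct counters *c, enum sr_state pdo_state)
{
    switch (pdo_state) {
    case SR_STATE_RUNNING:
        c->fdo_state = pdo_state;
        c->notify_calls++;
        c->next_request_calls++;
        break;
    default:
        c->fdo_state = pdo_state;
        c->notify_calls++;
        break;
    }
}
```

Calling handle_without_break once with SR_STATE_RUNNING on a zeroed struct leaves notify_calls at 2 and next_request_calls at 1; handle_with_break leaves both at 1. For any other state the two variants behave the same.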